Detecting changes in XML documents

@article{Cobena2002DetectingCI,
  title={Detecting changes in XML documents},
  author={Gregory Cobena and Serge Abiteboul and Am{\'e}lie Marian},
  journal={Proceedings 18th International Conference on Data Engineering},
  year={2002},
  pages={41--52}
}
We present a diff algorithm for XML data. [...] Also, it considers, besides insertions, deletions and updates (standard in diffs), a move operation on subtrees that is essential in the context of XML. Intuitively, our diff algorithm uses signatures to match (large) subtrees that were left unchanged between the old and new versions. Such exact matchings are then possibly propagated to ancestors and descendants to obtain more matchings. It also uses XML-specific information such as ID attributes. We…
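The signature idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `subtree_signatures` and `match_unchanged` are hypothetical, SHA-1 is an arbitrary choice of hash, and the propagation of matchings to ancestors and descendants, the use of ID attributes, and tail text are all left out. It only shows how hashing each subtree bottom-up lets identical subtrees be paired, largest first, even when one has moved.

```python
import hashlib
import xml.etree.ElementTree as ET


def subtree_signatures(root):
    """Return (node, signature, size) for every node, computed bottom-up."""
    entries = []

    def visit(node):
        kids = [visit(child) for child in node]
        # The signature covers tag, attributes, stripped text, and child
        # signatures in document order (tail text is ignored in this sketch).
        payload = repr((
            node.tag,
            sorted(node.attrib.items()),
            (node.text or "").strip(),
            [sig for sig, _ in kids],
        ))
        sig = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        size = 1 + sum(sz for _, sz in kids)
        entries.append((node, sig, size))
        return sig, size

    visit(root)
    return entries


def match_unchanged(old_root, new_root):
    """Pair identical subtrees of the two versions, trying large subtrees first."""
    old_by_sig = {}
    for node, sig, _ in subtree_signatures(old_root):
        old_by_sig.setdefault(sig, []).append(node)

    used_old, used_new = set(), set()
    matches = []
    new_entries = sorted(subtree_signatures(new_root), key=lambda e: -e[2])
    for node, sig, _ in new_entries:
        if id(node) in used_new:
            continue  # already covered by a matched ancestor
        for candidate in old_by_sig.get(sig, []):
            if id(candidate) not in used_old:
                matches.append((candidate, node))
                # Matching a subtree implicitly matches all its descendants.
                used_old.update(id(n) for n in candidate.iter())
                used_new.update(id(n) for n in node.iter())
                break
    return matches


old = ET.fromstring("<doc><a><b>x</b><c>y</c></a><d>z</d></doc>")
new = ET.fromstring("<doc><d>z</d><e><a><b>x</b><c>y</c></a></e></doc>")
print([n.tag for _, n in match_unchanged(old, new)])  # ['a', 'd']
```

In the example, the `a` subtree is matched even though it now sits under a new `e` element, which is the kind of situation a move operation would describe. The roots are not matched here because their signatures differ; recovering such matches would be the job of the propagation step mentioned in the abstract.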
X-Diff: an effective change detection algorithm for XML documents
  • Y. Wang, D. DeWitt, Jin-Yi Cai
  • Computer Science
  • Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405)
  • 2003
TLDR
This work proposes X-Diff, an effective algorithm that integrates key XML structure characteristics with standard tree-to-tree correction techniques, and argues that an unordered model (only ancestor relationships are significant) is more suitable for most database applications.
A comparative study of XML diff tools
The success of XML has recently renewed interest in change control on trees and semi-structured data. This is motivated, for instance, by the need to manage versions of documents, to query and…
A change detection system for unordered XML data using a relational model
TLDR
An efficient algorithm (XRel_Change_SQL) is proposed for detecting unordered changes between two XML data files stored in XRel as the underlying relational data model, using Structured Query Language (SQL).
Change Detection in XML Documents for Fixed Structures using Exclusive-Or (XOR)
TLDR
This paper proposes an automatic change detection algorithm which will identify changes between two versions of an XML document based on these signatures using XOR, and demonstrates that the algorithm outperforms the traditional algorithm which exhaustively searches the entire space.
Using versioned trees, change detection and node identity for three-way XML merging
  • C. Thao, E. Munson
  • Computer Science
  • SICS Software-Intensive Cyber-Physical Systems
  • 2013
TLDR
A three-way XML merge algorithm that is faster, uses less memory, and is more precise than previous algorithms; it uses a specialized versioning tree data structure that supports node identity and change detection.
On Change Detection of XML Schemas
TLDR
This paper uses the technique of storing XML Schema versions in a relational database, where the detection and storage of delta changes are carried out on relational tables, and shows that XS-Diff is a more meaningful method than other change detection methods for providing deltas that are optimal or near-optimal and semantically correct.
Structural Similarity Evaluation Between XML Documents and DTDs
TLDR
An algorithm is proposed for measuring the structural similarity between an XML document and a Document Type Definition (DTD), considered the simplest way of specifying structural constraints on XML documents.
Xandy: A scalable change detection technique for ordered XML documents using relational databases
TLDR
A prototype system called XANDY is implemented that converts XML documents into relational tuples and detects the changes from these tuples by using SQL queries; the experimental results show that the relational-based approach has better scalability than published algorithms such as X-Diff.
A structural similarity measure for XML documents: theory and applications
TLDR
A measure for evaluating the structural similarity between an XML document and a DTD is proposed that addresses the fundamental requirements for access protection of XML documents: varying granularity levels of protection ranging from a single element of a document to a set of documents.
Using versioned tree data structure, change detection and node identity for three-way XML merging
TLDR
An implementation of a three-way XML merge algorithm that is faster, uses less memory, and is more precise than existing tools is presented, and a graphical interface for visualizing and resolving conflicts is provided.

References

Monitoring XML data on the Web
TLDR
The monitoring used in a very large warehouse built from XML documents found on the web, which consists of XML pages that are warehoused and HTML pages that are not, is presented.
Meaningful change detection in structured data
TLDR
This paper presents a heuristic change detection algorithm that yields close to “minimal” descriptions of the changes, and that has fewer restrictions than previous algorithms.
Change-Centric Management of Versions in an XML Warehouse
TLDR
The foundations of the logical representation and some aspects of the physical storage policy are presented, and the implementation of the change-centric method to manage versions in a Web Warehouse of XML data is discussed.
Change detection in hierarchically structured information
TLDR
This work defines the hierarchical change detection problem as the problem of finding a "minimum-cost edit script" that transforms one data tree to another, and presents efficient algorithms for computing such an edit script.
Querying XML Documents in Xyleme
TLDR
The Xyleme project proposes to study and build a dynamic World Wide XML warehouse, i.e., a data warehouse capable of storing all the XML data available on the planet.
eXtensible Markup Language (XML) 1.0 (Second Edition)
The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way…
Efficient Filtering of XML Documents for Selective Dissemination of Information
TLDR
This paper develops several index organizations and search algorithms for performing efficient filtering of XML documents for large-scale information dissemination systems and examines their performance across a range of document, workload, and scale scenarios.
Efficient Snapshot Differential Algorithms for Data Warehousing
TLDR
Algorithms that perform (possibly lossy) compression of records and an algorithm that works very well if the snapshots are not "very different."
A System for Approximate Tree Matching
TLDR
This paper presents a system, called approximate-tree-by-example (ATBE), which allows inexact matching of trees, and describes the architecture of ATBE, its use, and some aspects of the ATBE implementation.
Alignment of Trees - An Alternative to Tree Edit
TLDR
The alignment of trees is proposed as a measure of the similarity between two labeled trees, and it is shown that the alignment problem can be solved in polynomial time if the trees have a bounded degree and becomes MAX SNP-hard if one of the trees is allowed to have an arbitrary degree.