Approximate Data Exchange
@inproceedings{Rougemont2007ApproximateDE, title={Approximate Data Exchange}, author={Michel de Rougemont and Adrien Vieilleribi{\`e}re}, booktitle={ICDT}, year={2007} }
We introduce approximate data exchange, by relaxing classical data exchange problems such as Consistency and Typechecking to their approximate versions based on Property Testing. It provides a natural framework for consistency and safety questions, which first considers approximate solutions and then exact solutions obtained with a Corrector.
We consider a model based on transducers of words and trees, and study ε-Consistency, i.e., the problem of deciding whether a given source instance I…
18 Citations
Data exchange in the presence of arithmetic comparisons
- Computer Science · EDBT '08
- 2008
A novel chase procedure called AC-chase, which takes the form of a tree, is defined, and it is proved that it produces a universal solution (appropriately defined to deal with arithmetic comparisons), which is the right tool for query answering in the case of unions of CQACs.
Data Exchange with Arithmetic Comparisons
- Computer Science
- 2007
It is shown that AC-chase computes a universal solution which can be used to compute certain answers for unions of conjunctive queries with arithmetic comparisons (UCQAC), and the complexity of existence of a solution is shown to be in NP.
Approximate Structural Consistency
- Computer Science · SOFSEM
- 2010
An approximate algorithm is described that decides whether I is close to a target regular schema (DTD); this property is testable, i.e., it can be decided in time independent of the size of the input document, by just sampling I.
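To make the idea of deciding by sampling concrete, here is a minimal Python sketch for words rather than XML documents. It is an illustration under stated assumptions, not the algorithm of the cited paper: the function name `sampled_kgram_stats` and its parameters are hypothetical, and the final decision step (comparing the estimate against what a schema admits) is left out. The point it shows is only that the number of positions inspected is fixed by `samples`, not by the input length.

```python
import random
from collections import Counter


def sampled_kgram_stats(word, k, samples, rng):
    """Estimate the k-gram frequency vector of `word` by random sampling.

    Hypothetical helper: it reads `samples` windows of length k, so the
    work done does not grow with len(word).
    """
    if len(word) < k:
        raise ValueError("word must contain at least k symbols")
    positions = len(word) - k + 1  # number of valid window start indices
    counts = Counter()
    for _ in range(samples):
        i = rng.randrange(positions)  # uniform start index
        counts[word[i:i + k]] += 1
    # Normalise so that the estimated frequencies sum to 1.
    return {gram: c / samples for gram, c in counts.items()}


if __name__ == "__main__":
    document = "abab" * 1000  # 4000 symbols, but only 500 windows are read
    estimate = sampled_kgram_stats(document, k=2, samples=500, rng=random.Random(0))
    print(estimate)
```

With a seeded `random.Random` the estimate is reproducible; a larger `samples` value tightens the estimate at a cost that still does not depend on the document size.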
Approximate consistency for transformations on words and trees
- Computer Science, Mathematics · Theor. Comput. Sci.
- 2016
Approximate Membership for Words and Trees
- Computer Science
- 2014
An objective is to obtain sublinear algorithms for deciding properties of XML databases approximately, by investigating whether an unranked tree is valid for a DTD or, more generally, whether it is recognized by a tree automaton.
Providing best-effort services in dataspace systems
- Computer Science
- 2007
This dissertation studies how to provide best-effort search, querying and browsing services in a dataspace system, even when precise schema mappings are not present, and proposes the concept of probabilistic schema mapping, with which it can return approximate answers even if precise mappings do not exist.
Query Relaxation across Heterogeneous Data Sources
- Computer Science · CIKM
- 2015
This paper proposes a technique to compute query relaxations of an input query that can be rewritten and evaluated in an environment of collaborating autonomous and heterogeneous data sources, and proposes both an exhaustive and an optimized heuristic algorithm to compute and evaluate these relaxations.
Approximate Validity of XML Streaming Data
- Computer Science · 2008 The Ninth International Conference on Web-Age Information Management
- 2008
A SAX implementation of the statistical embedding associated with XML data makes it possible to efficiently decide eps-validity with respect to any DTD or Schema, for the Edit Distance with Moves, and associates a generalized k-gram with unranked labelled trees, from which any regular property can be approximately decided.
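As a rough illustration of a k-gram statistical embedding computed in one streaming pass, the Python sketch below handles plain words only. It is not the SAX implementation or the generalized k-gram for trees described in the summary above; the names `streaming_kgram_stats` and `l1_distance` are hypothetical, and using the L1 distance between the two vectors as a stand-in for closeness is an assumption of this sketch whose guarantees are not reproduced here.

```python
from collections import Counter, deque


def streaming_kgram_stats(symbols, k):
    """Build a normalised k-gram frequency vector in a single pass.

    Only the last k symbols are kept in memory, in the spirit of an
    event-driven (streaming) parser.
    """
    window = deque(maxlen=k)  # sliding window over the stream
    counts = Counter()
    total = 0
    for s in symbols:
        window.append(s)
        if len(window) == k:
            counts[tuple(window)] += 1
            total += 1
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


def l1_distance(u, v):
    """L1 distance between two sparse frequency vectors."""
    return sum(abs(u.get(g, 0.0) - v.get(g, 0.0)) for g in set(u) | set(v))


if __name__ == "__main__":
    # The second word is the first with its two halves swapped (one block move).
    a = streaming_kgram_stats("aaaabbbb", 2)
    b = streaming_kgram_stats("bbbbaaaa", 2)
    print(l1_distance(a, b))
```

In this example the two words differ by a single block move, and their vectors differ only on the 2-grams `ab` and `ba`, which is the kind of behaviour that makes such vectors attractive when moves are cheap edits.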
Approximate Queries on Big Heterogeneous Data
- Computer Science · 2015 IEEE International Congress on Big Data
- 2015
Traditional techniques for query rewriting are extended, and heuristic algorithms to compute query approximations of an input query that can be rewritten and evaluated in an environment of collaborating autonomous and heterogeneous big data sources are proposed.
Data integration with uncertainty
- Computer Science · The VLDB Journal
- 2008
The concept of probabilistic schema mappings is introduced and it is shown that there are two possible semantics for such mappings: by-table semantics assumes that there exists a correct mapping but the author does not know what it is; by-tuple semantics assumes that the correct mapping may depend on the particular tuple in the source data.
References
Data exchange: semantics and query answering
- Computer Science · Theor. Comput. Sci.
- 2005
This paper gives an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that is called universal and shows that a universal solution has no more and no less data than required for data exchange and that it represents the entire space of possible solutions.
XML data exchange: consistency and query answering
- Computer Science · PODS '05
- 2005
This paper starts looking into the basic properties of XML data exchange, that is, restructuring of XML documents that conform to a source DTD under a target DTD, and answering queries written over the target schema, and proves a dichotomy theorem that classifies data exchange settings into those over which query answering is tractable, and those over which it is coNP-complete.
Approximate Satisfiability and Equivalence
- Computer Science, Mathematics · 21st Annual IEEE Symposium on Logic in Computer Science (LICS'06)
- 2006
The geometric embedding, and hence the tester algorithms, are extended to infinite regular languages and to context-free languages; the equivalence between two regular properties on words, defined by monadic second-order formulas, can also be tested.
Regular languages are testable with a constant number of queries
- Computer Science · 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039)
- 1999
This paper discusses testability of more complex languages and shows that the query complexity required for testing context-free languages cannot be bounded by any function of ε.
Word problems requiring exponential time (Preliminary Report)
- Computer Science, Mathematics · STOC
- 1973
A number of similar decidable word problems from automata theory and logic whose inherent computational complexity can be precisely characterized in terms of time or space requirements on deterministic or nondeterministic Turing machines are considered.
Composing schema mappings: Second-order dependencies to the rescue
- Computer Science, Mathematics · TODS
- 2005
It is shown that the composition of finite sets of source-to-target tgds is always definable by a second-order tgd, and that second-order tgds possess good properties for data exchange; a class of existential second-order formulas with function symbols is introduced, and a case is made that they are the "right" language for composing schema mappings.
Correctors for XML Data
- Computer Science · XSym
- 2004
It is shown how testers and correctors for regular trees can be used to estimate distances between a document and a set of DTDs, a useful operation to rank XML documents.
Property testing and its connection to learning and approximation
- Computer Science · JACM
- 1998
The authors study the question of determining whether an unknown function has a particular property or is ε-far from any function with that property, and devise algorithms to test whether a graph has properties such as being k-colorable or having a ρ-clique.
XML stream processing using tree-edit distance embeddings
- Computer Science · TODS
- 2005
These are the first algorithmic results on low-distortion embeddings for tree-edit distance metrics, and on correlating XML data in the streaming model.
Robust Characterizations of Polynomials with Applications to Program Testing
- Computer Science · SIAM J. Comput.
- 1996
The characterizations provide results in the area of coding theory by giving extremely fast and efficient error-detecting schemes for some well-known codes and play a crucial role in subsequent results on the hardness of approximating some NP-optimization problems.