Schema merging and mapping creation for relational sources

@inproceedings{Pottinger2008SchemaMA,
  title={Schema merging and mapping creation for relational sources},
  author={Rachel Pottinger and Philip A. Bernstein},
  booktitle={EDBT '08},
  year={2008}
}
We address the problem of generating a mediated schema from a set of relational data source schemas and conjunctive queries that specify where those schemas overlap. Unlike past approaches that generate only the mediated schema, our algorithm also generates view definitions, i.e., source-to-mediated schema mappings. Our main goal is to understand the requirements that a mediated schema and views should satisfy, such as completeness, preservation of overlapping information, normalization, and… 

Merging Relational Views: A Minimization Approach
TLDR
This work builds upon previous approaches that address relational view integration using logical mapping constraints and produces a minimal information-preserving mediated schema with constraints, and it generates output mappings defining the source schemas as views.
Automatic schema merging using mapping constraints among incomplete sources
TLDR
This paper presents a novel approach for merging multiple relational data sources related by a collection of mapping constraints in the form of P2P style tuple-generating dependencies (tgds) and proposes a merging algorithm following a redundancy reduction paradigm and proves that the output satisfies the desired logical properties.
Towards A Unified Framework For Schema Merging
TLDR
This work presents a novel logical framework for merging multiple relational schemas related via a collection of mapping constraints in the form of tuple-generating dependencies (tgds) to address the challenge of creating a mediated query interface for data integration systems.
Constraint driven schema merging
TLDR
A schema minimization approach is developed which generates minimal mediated schemas with the same query answering capacity as the source schemas, and syntactical constraints on the input mappings are identified which ensure that the proposed algorithms are in PTIME.
Interactive generation of integrated schemas
TLDR
An algorithm is developed that can systematically output, without duplication, all possible integrated schemas resulting from the previous choices, and facilitates the selection of the final integrated schema.
Automatic generation of mediated schemas through reasoning over data dependencies
TLDR
A novel system which is able to perform native n-ary schema merging using P2P style tgds as input and opts for a minimal schema signature retaining all certain answers of conjunctive queries is presented.
Schema Matching and Schema Merging based on Uncertain Semantic Mappings
TLDR
A schema integration framework which is only concerned with semantic mappings (that associate schema objects based on simple set based comparisons of the objects’ instances) and which explicitly represents and manages the uncertainty as to which semantic relationship is the correct one to use in any mapping is proposed.
The manipulation of schematic correspondences with the quantification of uncertainty in dataspaces
TLDR
This thesis proposes techniques for quantifying uncertainty in the equivalence of schema constructs from evidence in the form of similarity scores and user feedback, and provides a flexible framework for incrementally updating the uncertainties in the light of new evidence.
Tuned schema merging (TuSMe)
  • Gul Jabeen, N. Masood
  • Computer Science
    2011 5th International Conference on Software, Knowledge Information, Industrial Management and Applications (SKIMA) Proceedings
  • 2011
TLDR
The tuned schema merging (TuSMe) is an effort to develop a balanced GCS that will control its horizontal and vertical expansion.
Top-k generation of integrated schemas based on directed and weighted correspondences
TLDR
This paper proposes a more automatic approach to schema integration that is based on the use of directed and weighted correspondences between the concepts that appear in the source schemas and shows that the algorithm runs in polynomial time and has good performance in practice.

References

SHOWING 1-10 OF 22 REFERENCES
Processing queries and merging schemas in support of data integration
TLDR
This thesis presents the MiniCon algorithm for answering queries in a data integration system and explains why MiniCon outperforms previous algorithms by up to several orders of magnitude.
On the Expressive Power of Data Integration Systems
TLDR
This paper introduces the notion of query-preserving transformation, and query-reducibility between data integration systems, and shows that, when no integrity constraints are allowed in global schema, the LAV and the GAV approaches are incomparable.
Supporting executable mappings in model management
TLDR
A semantics for model-management operators is developed that allows applying the operators to executable mappings and is language-independent: the effect of the operators is expressed in terms of what they do to the instances of models and mappings.
Theoretical Aspects of Schema Merging
A general technique for merging database schemas is developed that has a number of advantages over existing techniques, the most important of which is that schemas are placed in a partial order that
Answering queries using views: A survey
  • A. Halevy
  • Computer Science
    The VLDB Journal
  • 2001
TLDR
The state of the art on the problem of answering queries using views is surveyed, the algorithms proposed to solve it are described, and the disparate works are synthesized into a coherent framework.
Data integration by bi-directional schema transformation rules
  • P. McBrien, A. Poulovassilis
  • Computer Science
    Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405)
  • 2003
TLDR
A new approach to data integration is described which subsumes the previous approaches of local as view (LAV) and global as view (GAV) and is based on the use of reversible schema transformation sequences.
Answering queries using views (extended abstract)
TLDR
It is shown that all the possible rewritings can be obtained by considering containment mappings from the views to the query, and that the problems considered are NP-complete when both the query and the views are conjunctive and don’t involve built-in comparison predicates.
Relative Information Capacity of Simple Relational Database Schemata
  • R. Hull
  • Computer Science
    SIAM J. Comput.
  • 1986
TLDR
It is indicated that the informal notion of relative information capacity often suggested in the conceptual database literature, which is based on accessibility of data via queries, is too general to accurately measure whether an underlying semantic connection exists between database schemata.
The Use of Information Capacity in Schema Integration and Translation
TLDR
A classification of common integration and translation tasks based on their operational goals, from which the relative information capacity requirements of the original and transformed schemas are derived, shows that for many tasks, information capacity equivalence of the schemas is not strictly required.
MiniCon: A scalable algorithm for answering queries using views
TLDR
MiniCon is described, a novel algorithm for finding the maximally-contained rewriting of a conjunctive query using a set of conjunctive views, and it is shown that MiniCon scales up well and significantly outperforms the previous algorithms.