Schema-Free XQuery
Data integration with uncertainty
TLDR
The concept of probabilistic schema mappings is introduced, and two possible semantics for such mappings are shown: by-table semantics assumes that a correct mapping exists but is not known; by-tuple semantics assumes that the correct mapping may depend on the particular tuple in the source data.
Making database systems usable
TLDR
A presentation data model is introduced, direct data manipulation with a schema-later approach is recommended, and the importance of provenance and of consistency across presentation models is stressed.
REX: Explaining Relationships between Entity Pairs
TLDR
REX is presented, a system that takes a pair of entities in a given knowledge base as input and efficiently identifies a ranked list of relationship explanations; relationship explanations are formally defined and their desirable properties analyzed.
Web-scale Data Integration: You can only afford to Pay As You Go
TLDR
This paper proposes a new data integration architecture, PAYGO, which is inspired by the concept of dataspaces and emphasizes pay-as-you-go data management as a means of achieving web-scale data integration.
Finding related tables
TLDR
This work considers the problem of finding related tables in a large corpus of heterogeneous tables and proposes a framework that captures several types of relatedness, including tables that are candidates for joins and tables that are candidates for union.
Constraint-based XML query rewriting for data integration
TLDR
The semantics of query answering in such an integration scenario is defined, and two novel algorithms are designed, basic query rewrite and query resolution, to implement the semantics.
XML schema refinement through redundancy detection and normalization
TLDR
This work presents the design and implementation of DiscoverXFD, the first system for efficient discovery of XML data redundancies, introduces a new normal form (GTT-XNF) for XML documents, and provides comprehensive comparisons with previous studies.
Web-Scale Data Integration: You can afford to Pay as You Go
TLDR
This paper proposes a new data integration architecture, PAYGO, which is inspired by the concept of dataspaces and emphasizes pay-as-you-go data management as a means of achieving web-scale data integration.
Michigan Molecular Interactions (MiMI): putting the jigsaw puzzle together
TLDR
Michigan Molecular Interactions gathers data from well-known protein interaction databases, deep-merges the information, and tracks the provenance of all data to help scientists judge the usefulness of a piece of data.