Modern database management systems essentially solve the problem of accessing and managing large volumes of related data on a single platform, or on a cluster of tightly-coupled platforms. But many problems remain when two or more databases need to work together. A fundamental problem is raised by semantic heterogeneity the fact that data duplicated across multiple databases is represented differently in the underlying database schemas. This tutorial describes fundamental problems raised by semantic heterogeneity and surveys theoretical frameworks that can provide solutions for them. The tutorial considers the following topics: (1) representative architectures for supporting database interoperation; (2) notions for comparing the “information capacity” of database schemas; (3) providing support for read-only integrated views of data, including the .virtual and materialized approaches; (4) providing support for read-write integrated views of data, including the issue of workflows on heterogeneous databases; and (5) research and tools for accessing and effectively using meta-data, e.g., to identify the relationships between schemas of different databases.
Unfortunately, ACM prohibits us from displaying non-influential references for this paper.
To see the full reference list, please visit http://dl.acm.org/citation.cfm?id=263668.