• Publications
Generic Schema Matching with Cupid
Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema…
  • 1,538
  • 129
Learning to map between ontologies on the semantic web
Ontologies play a prominent role on the Semantic Web. They make possible the widespread publication of machine-understandable data, opening myriad opportunities for automated information processing.
  • 1,054
  • 74
Similarity Search for Web Services
Web services are loosely coupled software components, published, located, and invoked across the web. The growing number of web services available within an organization and on the Web raises a new…
  • 815
  • 44
Reference reconciliation in complex information spaces
Reference reconciliation is the problem of identifying when different references (i.e., sets of attribute values) in a dataset correspond to the same real-world entity. Most previous literature…
  • 614
  • 38
Learning to match ontologies on the Semantic Web
On the Semantic Web, data will inevitably come from many different ontologies, and information processing across ontologies is not possible without knowing the semantic mappings between…
  • 530
  • 33
Recovering Semantics of Tables on the Web
The Web offers a corpus of over 100 million tables [6], but the meaning of each table is rarely explicit from the table itself. Header rows exist in few cases and even when they do, the attribute…
  • 300
  • 33
Ontology Matching: A Machine Learning Approach
This chapter studies ontology matching: the problem of finding the semantic mappings between two given ontologies. This problem lies at the heart of numerous information processing applications.
  • 517
  • 32
The Piazza peer data management system
Intuitively, data management and data integration tools are well-suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: They…
  • 305
  • 25
Corpus-based schema matching
Schema matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences or matches is inherently difficult to automate. Past solutions have…
  • 407
  • 24
Google's Deep Web crawl
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structured data on the Web,…
  • 388
  • 22