Establishing semantic interoperability among heterogeneous information sources has been a critical issue in the database community for the past two decades. Despite its importance, current approaches to semantic interoperability of heterogeneous databases have not been sufficiently effective. We propose a common ontology called semantic conflict…
The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise…
Data provenance refers to the "origin", "lineage", and "source" of data. In this work, we examine provenance from a semantics perspective and present the W7 model, an ontological model of data provenance. In the W7 model, provenance is conceptualized as a combination of seven interconnected elements including "what", "when", "where", "how"…
Asthma, which cannot be cured, is one of the most prevalent and costly chronic conditions in the United States. However, accurate and timely surveillance data could allow for prompt, targeted interventions at the community or individual level. Current national asthma disease surveillance systems can have data availability lags of up to two weeks. Rapid…
The quality of Wikipedia articles is debatable. On the one hand, existing research indicates that not only are people willing to contribute articles but the quality of these articles is close to that found in conventional encyclopedias. On the other hand, the public has never stopped criticizing the quality of Wikipedia articles, and critics never have…
Entity identification, i.e., detecting semantically corresponding records from heterogeneous data sources, is a critical step in integrating the data sources. The objective of this research is to develop and evaluate a novel multiple classifier system approach that improves entity identification accuracy. We apply various classification techniques drawn…
While many real-world applications need to organize data based on space (e.g., geology, geomarketing, environmental modeling) and/or time (e.g., accounting, inventory management, personnel management), existing conventional conceptual models do not provide a straightforward mechanism to explicitly capture the associated spatial and temporal semantics. As a…