A Framework and Architecture for Quality Assessment in Data Integration

Abstract

Data integration aims to combine distributed information sources that conform to different modelling methods and to provide interfaces for accessing the integrated resource. Data integration processes can be complex and error-prone because of the heterogeneities of the information sources. Moreover, data integration is a collaborative task involving many people with different levels of experience, knowledge of the application domain, and expectations. These factors make it difficult to determine and control the quality of a data integration setting. In this thesis, we investigate methods for improving the quality of integrated resources with respect to users’ requirements within an iterative integration process. We propose a quality framework capable of representing the different quality requirements of the stakeholders involved in the integration process. Ontology-based inferencing within this framework allows the data integrator to identify amendments to the integrated resource that better satisfy users’ quality expectations. We define several quality criteria and factors specific to the context of data integration and propose a number of quality metrics for measuring these factors. We also propose a data integration methodology that supports quality assessment of the integrated resource, together with an integration architecture that realises this methodology. Using a real-world case study, we show how the quality of an integrated resource can be improved with our quality framework, quality criteria and factors, and data integration methodology.

Cite this paper

@inproceedings{Wang2012AFA, title={A Framework and Architecture for Quality Assessment in Data Integration}, author={Jianing Wang}, year={2012} }