Discovering data quality rules
- Fei Chiang, Renée J. Miller
- Computer Science
- Proceedings of the VLDB Endowment
- 1 August 2008
This work proposes a new data-driven tool that can be used within an organization's data quality management process to suggest possible rules, and to identify conformant and non-conformant records.
Framework for Evaluating Clustering Algorithms in Duplicate Detection
- Oktie Hassanzadeh, Fei Chiang, Renée J. Miller, Hyun Chul Lee
- Computer Science
- Proceedings of the VLDB Endowment
- 1 August 2009
This work uses Stringer to evaluate the quality of the clusters obtained from several unconstrained clustering algorithms used in concert with approximate join techniques, and reveals that some clustering algorithms that have never been considered for duplicate detection perform extremely well in terms of both accuracy and scalability.
A unified model for data and constraint repair
- Fei Chiang, Renée J. Miller
- Computer Science
- IEEE International Conference on Data Engineering
- 11 April 2011
This work presents a novel unified cost model that allows data and constraint repairs to be compared on an equal footing, and considers repairs over a database that is inconsistent with respect to a set of rules, modeled as functional dependencies (FDs).
Seeking Stable Clusters in the Blogosphere
- Nilesh Bansal, Fei Chiang, N. Koudas, Frank Wm. Tompa
- Economics, Computer Science
- Very Large Data Bases Conference
- 23 September 2007
This paper presents efficient algorithms to identify keyword clusters in large collections of blog posts for specific temporal intervals, and formalizes problems related to the temporal properties of such clusters.
Continuous data cleaning
- M. Volkovs, Fei Chiang, Jaroslaw Szlichta, Renée J. Miller
- Computer Science
- IEEE International Conference on Data Engineering
- 19 May 2014
This work introduces a continuous data cleaning framework that can be applied to dynamic data and constraint environments and that uses as evidence not only the data and constraints, but also the past repairs chosen and applied by a user (user repair preferences).
CONDOR
- Joshua Segeren, Dhruv Gairola, Fei Chiang
- The Invincible
- 3 November 2014
Restoring Consistency in Ontological Multidimensional Data Models via Weighted Repairs
- Enamul Haque, Fei Chiang
- Computer Science
- International Conference on Knowledge-Based…
- 2019
CurrentClean: Spatio-Temporal Cleaning of Stale Data
- Mostafa Milani, Zheng Zheng, Fei Chiang
- Computer Science
- IEEE International Conference on Data Engineering
- 8 April 2019
This work presents CurrentClean, a probabilistic system for identifying and cleaning stale values that captures database update patterns to infer stale values, together with a set of inference rules that model spatio-temporal update patterns commonly seen in real data.
Ontology-based Entity Matching in Attributed Graphs
- Hanchao Ma, Morteza Alipour Langouri, Yinghui Wu, Fei Chiang, Jiaxing Pi
- Computer Science
- Proceedings of the VLDB Endowment
- 1 June 2019
This work proposes a new class of key constraints, Ontological Graph Keys (OGKs), that extend conventional graph keys by ontological subgraph matching between entity labels and an external ontology, and shows that the implication and validation problems for OGKs are each NP-complete.
An Algebraic Approach Towards Data Cleaning
- Ridha Khédri, Fei Chiang, K. Sabri
- Computer Science
- EUSPN/ICTH
- 2013