A Masking Index for Quantifying Hidden Glitches

@inproceedings{Bertiquille2013AMI,
  title={A Masking Index for Quantifying Hidden Glitches},
  author={Laure Berti-{\'E}quille and J. Loh and T. Dasu},
  booktitle={ICDM},
  year={2013}
}
Data glitches are errors in a data set. They are complex entities that often span multiple attributes and records. When they co-occur in data, the presence of one type of glitch can hinder the detection of another type of glitch. This phenomenon is called masking. In this paper, we define two important types of masking, and we propose a novel, statistically rigorous indicator called the masking index for quantifying the hidden glitches in four cases of masking: outliers masked by missing values…
2 Citations
Just-in-time Analytics Over Heterogeneous Data and Hardware
Data Quality: The Role of Empiricism
