On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study

  title={On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study},
  author={G. O. Campos and A. Zimek and J. Sander and R. Campello and Barbora Micenkov{\'a} and Erich Schubert and I. Assent and M. E. Houle},
  journal={Data Mining and Knowledge Discovery},
  • G. O. Campos, A. Zimek, J. Sander, R. Campello, Barbora Micenková, Erich Schubert, I. Assent, M. E. Houle
  • Published 2015
  • Computer Science
  • Data Mining and Knowledge Discovery
  • The evaluation of unsupervised outlier detection algorithms is a constant challenge in data mining research. Little is known regarding the strengths and weaknesses of different standard outlier detection models, and the impact of parameter choices for these algorithms. The scarcity of appropriate benchmark datasets with ground truth annotation is a significant impediment to the evaluation of outlier methods. Even when labeled datasets are available, their suitability for the outlier detection…

    A local density-based approach for outlier detection
    • 71
    On the Evaluation of Outlier Detection and One-Class Classification Methods
    • 28
    Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection
    • 31
    Contextual Outlier Interpretation
    • 17
    On normalization and algorithm selection for unsupervised outlier detection
    • 8
    Unsupervised Feature Selection for Outlier Detection by Modelling Hierarchical Value-Feature Couplings
    • 21
    Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection
    • 9
    Generative Adversarial Active Learning for Unsupervised Outlier Detection
    • 31

