A Survey of Outlier Detection Methodologies

  • V. Hodge, J. Austin
  • Published 2004
  • Computer Science
  • Artificial Intelligence Review
  • Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as…
