Explaining anomalies in groups with characterizing subspace rules

@article{Macha2018ExplainingAI,
  title={Explaining anomalies in groups with characterizing subspace rules},
  author={Meghanath Macha and Leman Akoglu},
  journal={Data Mining and Knowledge Discovery},
  year={2018},
  volume={32},
  pages={1444--1480}
}
Anomaly detection has numerous applications and has been studied extensively. We consider a complementary problem that has a much sparser literature: anomaly description. Interpretation of anomalies is crucial for practitioners for sense-making, troubleshooting, and planning actions. To this end, we present a new approach called x-PACS (for eXplaining Patterns of Anomalies with Characterizing Subspaces), which “reverse-engineers” the known anomalies by identifying (1) the groups (or patterns) that…
A Comparative Evaluation of Anomaly Explanation Algorithms
Detection of anomalies (i.e., outliers) in multi-dimensional data is a well-studied subject in machine learning. Unfortunately, unsupervised detectors provide no explanation about why a data point
Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability
TLDR
The results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, the active learning algorithms under the streaming-data setup are competitive with the batch setup.
Anomaly Mining - Past, Present and Future
TLDR
This article focuses on two areas, (1) point-cloud and (2) graph-based anomaly mining, and aims to present a broad view of each area and to discuss classes of main research problems, recent trends, and future directions.
Anomaly explanation with random forests
Anomaly Detection with Machine Learning in the Presence of Extreme Value - A Review Paper
TLDR
A conclusion is provided that discusses the importance of clearly distinguishing between outliers and extreme values in the data pre-processing stage, so that effective and efficient prediction can be achieved later in the modelling stage.
Anomaly Mining: Past, Present and Future
TLDR
While the vast majority of the literature focuses on novel detection algorithms, anomaly mining also concerns various human-centric problems that are beyond mere detection, namely explanation, human interaction, and fairness, which pose many open challenges.
GLAD: GLocalized Anomaly Detection via Human-in-the-Loop Learning
TLDR
A novel human-in-the-loop learning algorithm called GLAD (GLocalized Anomaly Detection) that supports global anomaly detectors and automatically learns their local relevance to specific data instances using label feedback from human analysts.
LP-Explain: Local Pictorial Explanation for Outliers
TLDR
A new outlier explanation approach, called LP-Explain, which tries to identify the set of best Local Pictorial explanations (defined as the scatter plots in the 2-D space of the feature pairs) that can Explain the behavior of a cluster of outliers.
A survey on outlier explanations
TLDR
A survey on outlier explanations, in which meaningful knowledge is mined from anomalous data to explain them; the existing outlier explanation techniques are reviewed, along with how they address the challenges.
Detection of Thin Boundaries between Different Types of Anomalies in Outlier Detection Using Enhanced Neural Networks
TLDR
Basic domain-independent methods are introduced to detect these defined anomalies in both unsupervised and supervised datasets, and the Multi-Layer Perceptron Neural Network is enhanced using the Genetic Algorithm to detect the newly defined anomalies with higher precision.

References

Explaining anomalies with Sapling Random Forests
TLDR
An algorithm called Explainer is proposed, which returns the explanation of a sample’s differentness in disjunctive normal form (DNF), which is easy for humans to understand and can be applied in many domains to simplify the investigation of anomalies.
Interpreting and clustering outliers with sapling random forests
TLDR
A method based on specifically trained decision trees, called sapling random forest, which is able to interpret the output of an arbitrary anomaly detector, and two methods of clustering anomalies into groups, which can be investigated at once, saving time and human effort.
Discovering Characterizations of the Behavior of Anomalous Subpopulations
TLDR
The notion of exceptional property is introduced and the concept of exceptionality score is defined, which measures the significance of a property and is designed to be effective for small samples, as is the case for outliers.
Outlier Ranking via Subspace Analysis in Multiple Views of the Data
TLDR
This work proposes Outrank, a novel outlier ranking concept that exploits subspace analysis to determine the degree of outlierness, and outperforms state-of-the-art outlierness measures.
Explaining Outliers by Subspace Separability
TLDR
This paper proposes a method that determines possible explanations for the outlier in the form of subspaces in which the given outlier shows separability from the inliers; it also includes a heuristic that gives a substantial speedup over the baseline strategy.
Detecting outlying properties of exceptional objects
TLDR
This article is concerned with the problem of discovering sets of attributes that account for the (a priori stated) abnormality of an individual within a given dataset and proposes efficient algorithms for detecting both global and local forms of most abnormal properties.
EXstream: Explaining Anomalies in Event Stream Monitoring
TLDR
Evaluation using two real-world use cases shows that EXstream can outperform existing techniques significantly in conciseness and consistency while achieving comparable high prediction power and retaining a highly efficient implementation of a data stream system.
Discriminative features for identifying and interpreting outliers
TLDR
An algorithm is proposed that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation.
A Framework for Outlier Description Using Constraint Programming
TLDR
A framework based on constraint programming is proposed to find an optimal subset of features that most differentiates the outliers and normal instances, and it is demonstrated that the framework offers great flexibility in incorporating diverse scenarios arising in practice, such as multiple explanations and human-in-the-loop extensions.
Local Outlier Detection with Interpretation
TLDR
This paper develops an approach that explores the quadratic entropy to adaptively select a set of neighboring instances, and a learning method to seek an optimal subspace in which an outlier is maximally separated from its neighbors.