• Corpus ID: 5469265

A Bayesian Ensemble for Unsupervised Anomaly Detection

@article{Yu2016ABE,
  title={A Bayesian Ensemble for Unsupervised Anomaly Detection},
  author={Edward Yu and Parth Parekh},
  journal={ArXiv},
  year={2016},
  volume={abs/1610.07677}
}
Methods for unsupervised anomaly detection suffer from the fact that the data is unlabeled, making it difficult to assess the optimality of detection algorithms. Ensemble learning has shown exceptional results in classification and clustering problems, but has not seen as much research in the context of outlier detection. Existing methods focus on combining output scores of individual detectors, but this leads to outputs that are not easily interpretable. In this paper, we introduce a… 

Figures from this paper

Anomaly Detection in Multivariate Time Series Using Fuzzy AdaBoost and Dynamic Naive Bayesian Classifier
TLDR
This paper presents a novel method to detect anomaly using Fuzzy AdaBoost and Dynamic Naive Bayesian classifier and introduces an approach to initialize the intervals of FOU using the statistical assets of data that belongs to the normal class.
Combining OC-SVMs With LSTM for Detecting Anomalies in Telemetry Data With Irregular Intervals
TLDR
A novel anomaly detection method, Detecting Anomalies using LSTM and Ensembled One-Class Support Vector Machines (DALEO) is proposed to further improve the performance of anomaly detection.
Detecting and Predicting Anomalies for Edge Cluster Environments using Hidden Markov Models
  • Areeg Samir, C. Pahl
  • Computer Science
    2019 Fourth International Conference on Fog and Mobile Edge Computing (FMEC)
  • 2019
TLDR
An anomaly detection and prediction model based on Hidden Markov Model (HMM) that addresses the problem of mapping observations to underlying infrastructure problems in order to optimize system availability and performance is presented.
Unsupervised Real-Time Stream-Based Novelty Detection Technique an Approach in a Corporate Cloud
TLDR
The authors developed an approach to the detection of changepoints based on the specific conditions of the streaming telemetry data to demonstrate the viability of the proposed approach as well as directions for further study and improvement.
Intrusion Detection for IoT Based on Improved Genetic Algorithm and Deep Belief Network
TLDR
The experimental results show that the improved intrusion detection model combined with DBN can effectively improve the recognition rate of intrusion attacks and reduce the complexity of the neural network structure.
Toward Detecting Infection Incidence in People With Type 1 Diabetes Using Self-Recorded Data (Part 1): A Novel Framework for a Personalized Digital Infectious Disease Detection System
TLDR
The results demonstrated that compared with regular or normal days, infection incidence substantially alters the norm of blood glucose dynamics, which are quite significant changes that could possibly be detected through personalized modeling, for example, prediction models and anomaly detection algorithms.

References

SHOWING 1-10 OF 20 REFERENCES
Anomaly Detection Ensembles: In Defense of the Average
  • Alvin Chiang, Yi-Ren Yeh
  • Computer Science
    2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
  • 2015
TLDR
This work investigates the usefulness of building ensembles for the purpose of anomaly detection and finds that currently, to the best of the knowledge, there is no great advantage in using anything more complicated than the simple average over all available outlier scores.
Converting Output Scores from Outlier Detection Algorithms into Probability Estimates
  • Jing Gao, P. Tan
  • Computer Science
    Sixth International Conference on Data Mining (ICDM'06)
  • 2006
TLDR
This paper presents two methods for transforming outlier scores into probabilities that models the score distributions as a mixture of exponential and Gaussian probability functions and calculates the posterior probabilites via the Bayes' rule.
Feature bagging for outlier detection
TLDR
A novel feature bagging approach for detecting outliers in very large, high dimensional and noisy databases is proposed, which combines results from multiple outlier detection algorithms that are applied using different set of features.
Ensembles for unsupervised outlier detection: challenges and research questions a position paper
TLDR
The core ingredients for building an outlier ensemble are focused on, the first steps taken in the literature are discussed, and challenges for future research are identified.
Consensus extraction from heterogeneous detectors to improve performance over network traffic anomaly detection
TLDR
This paper proposes a discriminative model by reaching consensus among multiple atomic anomaly detectors in an unsupervised manner when there are very few or even no known anomalous events for training, and produces a perevent based non-trivial weighted combination of the atomic detectors.
Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs
TLDR
An ensemble approach for anomaly mining that employs novel techniques to automatically and systematically select the results to assemble in a fully unsupervised fashion, which yields superior performance compared to individual detectors alone, the full ensemble, and an existing diversity-based ensemble.
Theoretical Foundations and Algorithms for Outlier Ensembles
TLDR
The theoretical underpinnings of outlier ensemble analysis are investigated and new combination functions that are robust in many settings are proposed, including more effective variants of subsampling and feature bagging.
On Evaluation of Outlier Rankings and Outlier Scores
TLDR
A generalized view of evaluation methods is presented that allows both to evaluate the performance of existing methods as well as to compare different methods w.r.t. their detection performance.
Bayesian Classifier Combination
TLDR
A general framework for Bayesian model combination (which differs from model averaging) in the context of classification is explored, which explicitly models the relationship between each model’s output and the unknown true label.
Interpreting and Unifying Outlier Scores
TLDR
It is shown that a unification of outlier scores provided by various outlier models and a translation of the arbitrary “outlier factors” to values in the range of 0, 1 interpretable as values describing the probability of a data object of being an outlier facilitates enhanced ensembles for outlier detection.
...
...