A unifying framework for detecting outliers and change points from non-stationary time series data

  • Kenji Yamanishi, Jun’ichi Takeuchi
  • Published 23 July 2002
  • Computer Science
  • Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
We are concerned with the issues of outlier detection and change point detection from a data stream. [...] Key method: in this framework, a probabilistic model of the data source is incrementally learned using an on-line discounting learning algorithm, which tracks the changing data source adaptively by gradually forgetting the effect of past data.
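The idea of discounting can be illustrated with a minimal sketch. It is not the paper's model: it assumes a one-dimensional Gaussian whose mean and variance are exponentially weighted averages, and it scores each point by its negative log-likelihood under the model learned so far. The class name `DiscountedGaussian`, the discounting rate `r`, and the initial values are all hypothetical choices made for illustration.

```python
import math


class DiscountedGaussian:
    """Online 1-D Gaussian whose estimates forget old data at rate r (illustrative only)."""

    def __init__(self, r=0.05, mean=0.0, var=1.0):
        if not 0.0 < r < 1.0:
            raise ValueError("r must lie in (0, 1)")
        self.r = r
        self.mean = mean
        self.var = var

    def score(self, x):
        """Negative log-likelihood of x under the current model."""
        return (0.5 * math.log(2.0 * math.pi * self.var)
                + (x - self.mean) ** 2 / (2.0 * self.var))

    def update(self, x):
        """Score x against the model learned so far, then fold x into the model."""
        s = self.score(x)
        r = self.r
        # Old statistics are weighted by (1 - r), the new observation by r,
        # so the influence of a past point decays geometrically.
        self.mean = (1.0 - r) * self.mean + r * x
        self.var = max((1.0 - r) * self.var + r * (x - self.mean) ** 2, 1e-12)
        return s


model = DiscountedGaussian(r=0.1)
stream = [0.1, -0.2, 0.0, 0.3, 8.0, 0.1]
scores = [model.update(x) for x in stream]
```

A larger `r` makes the model follow changes faster but also makes it noisier; a smaller `r` behaves more like a stationary estimate.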

A unifying framework for detecting outliers and change points from time series

This paper presents a unifying framework for outlier detection and change point detection, in which a model is incrementally learned using an online discounting learning algorithm; comparisons with conventional methods demonstrate its validity through simulation and experimental applications to incident detection in network security.

Coding of Non-Stationary Sources as a Foundation for Detecting Change Points and Outliers in Binary Time-Series

Theoretical foundations are provided for the use of adaptive estimation procedures for estimating and adapting distributions in real-time for non-stationary data using an adaptive version of the Context Tree Weighting compression algorithm.

Change-Point Detection with Feature Selection in High-Dimensional Time-Series Data

A supervised-learning-based change-point detection approach is proposed in which the separability of past and future data at time t is used as the plausibility of change points, together with a detection measure called the additive Hilbert-Schmidt Independence Criterion (aHSIC).

Online Conditional Outlier Detection in Nonstationary Time Series

This work proposes a new two-layer outlier detection approach that first tries to model and account for the nonstationarity and periodic variation in the time series, and then tries to use other observable variables in the environment to explain any additional signal variation.

A Novel Framework for Context-aware Outlier Detection in Big Data Streams

This paper proposes a novel framework for contextual outlier detection in big data streams, which injects the contextual attributes into the stream content as a primary input for outlier detection, rather than using the stream content alone or applying contextual detection to content anomalies only.

Outlier Detection for Temporal Data

This book presents a wide collection of applications where temporal outlier detection techniques have been applied to discover interesting outliers, and lays out a taxonomy of proposed techniques for temporal outlier detection.

Incremental Local Outlier Detection for Data Streams

The paper provides theoretical evidence that inserting a new data point or deleting an old one influences only a limited number of its closest neighbors, and thus the number of updates per insertion or deletion does not depend on the total number of points in the data set.

Change-Point Detection in Time-Series Data by Direct Density-Ratio Estimation

This paper provides a change-point detection algorithm based on direct density-ratio estimation that can be computed very efficiently in an online manner and avoids non-parametric density estimation, which is known to be a difficult problem.

Anomaly Detection over Concept Drifting Data Streams

A hybrid framework combining LOF (Local Outlier Factor) and BPNN (Back-Propagation Neural Network), appropriate for detecting outliers in data streams, is proposed; it provides detection performance equivalent to that of the iterated static LOF algorithm while requiring significantly less computational time.



On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms

An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.

Event detection from time series data

An iterative algorithm is proposed that fits a model to a time segment, and uses a likelihood criterion to determine if the segment should be partitioned further, i.e. if it contains a new changepoint.
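A recursive segmentation of this kind might look like the sketch below. It is a simplified stand-in rather than the algorithm of the cited paper: it assumes a constant-mean Gaussian model per segment, so the likelihood criterion reduces to the drop in the sum of squared errors, and it splits only when that drop exceeds a threshold. The names `sse` and `find_changepoints` and the `penalty` and `min_len` parameters are hypothetical.

```python
def sse(xs):
    """Sum of squared deviations from the mean; for a constant-mean Gaussian
    segment this is the negative log-likelihood up to constants."""
    if not xs:
        return 0.0
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def find_changepoints(xs, penalty, min_len=2, offset=0):
    """Recursively split xs wherever the best split lowers the cost by more than penalty."""
    n = len(xs)
    whole = sse(xs)
    best_gain, best_k = 0.0, None
    # Try every admissible split position and keep the one with the largest gain.
    for k in range(min_len, n - min_len + 1):
        gain = whole - sse(xs[:k]) - sse(xs[k:])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain <= penalty:
        return []  # no split improves the fit enough: treat as one segment
    left = find_changepoints(xs[:best_k], penalty, min_len, offset)
    right = find_changepoints(xs[best_k:], penalty, min_len, offset + best_k)
    return left + [offset + best_k] + right


series = [0.0] * 10 + [5.0] * 10
changepoints = find_changepoints(series, penalty=1.0)
```

The `penalty` plays the role of a model-selection threshold: raising it yields fewer, more pronounced change points.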

Discovering outlier filtering rules from unlabeled data: combining a supervised learner with an unsupervised learner

Applying this framework to network intrusion detection demonstrates that it can significantly improve the accuracy of SmartSifter, and that outlier filtering rules can help the user discover a general pattern in an outlier group.

Algorithms for Mining Distance-Based Outliers in Large Datasets

This paper provides formal and empirical evidence showing the usefulness of DB-outliers and presents two simple algorithms for computing such outliers, both having a complexity of O(k N^2), k being the dimensionality and N being the number of objects in the dataset.
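The quadratic cost can be seen in a naive pairwise sketch. It assumes the usual definition of a DB(p, D)-outlier, namely an object for which at least a fraction p of the dataset lies farther than distance D away, and it is not claimed to match either of the paper's two algorithms. The function name `db_outliers` and the sample data are hypothetical.

```python
import math


def db_outliers(points, p, D):
    """Return indices of DB(p, D)-outliers by comparing every pair of points.

    Each of the N points is compared with all N points, and each distance
    costs O(k) for k dimensions, giving O(k N^2) overall.
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the points lying farther than D from a (a itself is at distance 0).
        far = sum(1 for b in points if math.dist(a, b) > D)
        if far >= p * n:
            outliers.append(i)
    return outliers


data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (10.0, 10.0)]
found = db_outliers(data, p=0.75, D=1.0)
```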

Detecting Cellular Fraud Using Adaptive Prototypes.

Using a recurrent neural network technique, prototypes are uniformly distributed over Toll Tickets to form statistical behaviour profiles covering both the short and long-term past to be prepared for the would-be fraudster for both GSM and UMTS.

Unsupervised Profiling for Identifying Superimposed Fraud

This paper presents a comprehensive representation of “customer behavior” and discusses issues derived from it: a distance function and a clustering algorithm for probability distributions.

Smoothness priors analysis of time series

Contents: 1 Introduction; 1.1 Background; 1.2 What is in the Book; 1.3 Time Series Examples; 2 Modeling Concepts and Methods; 2.1 Akaike's AIC: Evaluating Parametric Models; 2.1.1 The Kullback-Leibler ...

Activity monitoring: noticing interesting changes in behavior

It is shown that two superficially different tasks, news story monitoring and intrusion detection, can be expressed naturally within the framework, and that key differences in solution methods can be compared.

A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants

An incremental variant of the EM algorithm in which the distribution for only one of the unobserved variables is recalculated in each E step is shown empirically to give faster convergence in a mixture estimation problem.


Two methods of fitting piecewise multiple regression models are presented. One, based on dynamic programming, yields maximum‐likelihood estimators and is suitable for sequences of moderate length. A