Robust change detection for large-scale data streams

Ruizhi Zhang, Yajun Mei, and JianJun Shi. Sequential Analysis, pp. 1-19.
Abstract: Robust change point detection for large-scale data streams has many real-world applications in industrial quality control, signal detection, and biosurveillance. Unfortunately, it is highly nontrivial to develop efficient schemes due to three challenges: (1) the unknown sparse subset of affected data streams, (2) the unexpected outliers, and (3) computational scalability for real-time monitoring and detection. In this article, we develop a family of efficient real-time robust detection…


Scalable SUM-Shrinkage Schemes for Distributed Monitoring Large-Scale Data Streams
This article proposes scalable global monitoring schemes that run local detection procedures in parallel and combine them into a global decision using SUM-shrinkage techniques.
Optimal sequential detection in multi-stream data
  • H. Chan
  • Computer Science, Mathematics
  • 2015
This work shows how the (optimal) detection delay depends on the fraction of data streams undergoing distribution changes as the number of detectors goes to infinity, and shows that the optimal detection delay is achieved by the sum of detectability score transformations of either the partial scores or CUSUM scores of the data streams.
Sequential multi-sensor change-point detection
  • Yao Xie, D. Siegmund
  • Mathematics
  • 2013 Information Theory and Applications Workshop (ITA)
  • 2013
We develop a mixture procedure to monitor parallel streams of data for a change-point that affects only a subset of them, without assuming a spatial structure relating the data streams to one
Asymptotic Statistical Properties of Communication-Efficient Quickest Detection Schemes in Sensor Networks
This work develops scalable communication-efficient schemes based on the sum of those local CUSUM statistics that are “large” under hard, soft, or order thresholding rules, and establishes their asymptotic statistical properties under two regimes.
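The thresholding idea in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each stream runs a one-sided CUSUM for a N(0,1) to N(mu,1) mean shift, and the names `cusum_update`, `hard_sum`, `soft_sum`, `order_sum`, and `first_alarm`, along with the parameters `mu`, `b`, `r`, and `threshold`, are hypothetical choices made for illustration.

```python
def cusum_update(w, x, mu=1.0):
    """One step of a one-sided CUSUM, assuming a N(0,1) -> N(mu,1) mean shift."""
    return max(0.0, w + mu * x - 0.5 * mu * mu)


def hard_sum(stats, b):
    """Hard thresholding: sum only the local statistics that reach b."""
    return sum(w for w in stats if w >= b)


def soft_sum(stats, b):
    """Soft thresholding: sum the excess of each local statistic over b."""
    return sum(max(w - b, 0.0) for w in stats)


def order_sum(stats, r):
    """Order thresholding: sum the r largest local statistics."""
    return sum(sorted(stats, reverse=True)[:r])


def first_alarm(rows, threshold, combine):
    """Return the first time (1-based) the combined statistic reaches
    threshold, or None if it never does. Each row holds one observation
    per stream; combine maps the list of local statistics to a number."""
    stats = None
    for n, row in enumerate(rows, start=1):
        if stats is None:
            stats = [0.0] * len(row)
        # Each stream updates its own statistic independently (parallelizable).
        stats = [cusum_update(w, x) for w, x in zip(stats, row)]
        if combine(stats) >= threshold:
            return n
    return None


if __name__ == "__main__":
    # Three streams; only the first one shifts, starting at time 3.
    rows = [[0.0, 0.0, 0.0]] * 2 + [[2.0, 0.0, 0.0]] * 5
    print(first_alarm(rows, 3.0, lambda s: soft_sum(s, 1.0)))
```

In this sketch only streams whose local statistic is "large" contribute to the global sum, which is what lets unaffected streams stay silent when the change touches a sparse subset.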
An online kernel change detection algorithm
This work considers the case where neither an accurate nor a tractable model can be found. It takes a model-free approach, called kernel change detection (KCD), and builds a dissimilarity measure in feature space between two sets of descriptors, which is shown to be asymptotically equivalent to the Fisher ratio in the Gaussian case.
Efficient Computer Network Anomaly Detection by Changepoint Detection Methods
This work proposes a novel score-based multi-cyclic detection algorithm based on the Shiryaev-Roberts procedure that is as easy to employ in practice and as computationally inexpensive as the popular Cumulative Sum chart and the Exponentially Weighted Moving Average scheme.
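For orientation, the three recursions named in this summary can be sketched in Python as follows. This shows the classical likelihood-ratio forms, not the paper's score-based variant, and it assumes a N(0,1) to N(mu,1) mean shift. The function names and the parameters `mu`, `lam`, and `threshold` are hypothetical, and "multi-cyclic" is read here as restarting the statistic from zero after every alarm.

```python
import math


def likelihood_ratio(x, mu=1.0):
    """Likelihood ratio of N(mu,1) against N(0,1) at observation x."""
    return math.exp(mu * x - 0.5 * mu * mu)


def sr_update(r, x, mu=1.0):
    """Shiryaev-Roberts recursion: R_n = (1 + R_{n-1}) * LR_n, with R_0 = 0."""
    return (1.0 + r) * likelihood_ratio(x, mu)


def cusum_update(w, x, mu=1.0):
    """CUSUM recursion on the log scale: W_n = max(0, W_{n-1} + log LR_n)."""
    return max(0.0, w + mu * x - 0.5 * mu * mu)


def ewma_update(z, x, lam=0.1):
    """EWMA recursion: Z_n = (1 - lam) * Z_{n-1} + lam * x_n."""
    return (1.0 - lam) * z + lam * x


def sr_alarm_times(xs, threshold, mu=1.0):
    """Multi-cyclic use of the Shiryaev-Roberts statistic (assumed reading):
    raise an alarm whenever R_n reaches threshold, then restart from zero."""
    alarms = []
    r = 0.0
    for n, x in enumerate(xs, start=1):
        r = sr_update(r, x, mu)
        if r >= threshold:
            alarms.append(n)
            r = 0.0  # restart the cycle after each alarm
    return alarms


if __name__ == "__main__":
    print(sr_alarm_times([0.5] * 8, 4.0))
```

All three updates need constant memory and a handful of arithmetic operations per observation, which is the sense in which they are comparably inexpensive.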
Minimax Robust Quickest Change Detection
A robust version of these quickest change detection problems is considered in which the pre-change and post-change distributions are not known exactly but belong to known uncertainty classes of distributions. Least favorable distributions (LFDs) are identified within these classes such that the detection rule designed for the LFDs is optimal for the robust problem in a minimax sense.
Statistical Challenges Facing Early Outbreak Detection in Biosurveillance
This work focuses mainly on the monitoring of time series to provide early alerts of anomalies to stimulate investigation of potential outbreaks, with a brief summary of methods to detect significant spatial and spatiotemporal case clusters.
Asymptotically Optimal Quickest Change Detection in Distributed Sensor Systems
This paper presents asymptotically optimal decentralized quickest change detection procedures for two scenarios, considers the minimax, uniform, and Bayesian versions of the optimization problem, and reports simulation results for examples involving Gaussian and Poisson observations.
A Robust Surveillance Scheme for Stochastically Ordered Alternatives
We sequentially observe independent observations X_1, X_2, ... such that initially they have distribution G_0; at some unknown time v they become stochastically larger, having distribution G_1. Neither