Learning from Time-Changing Data with Adaptive Windowing

@inproceedings{Bifet2007LearningFT,
  title={Learning from Time-Changing Data with Adaptive Windowing},
  author={Albert Bifet and Ricard Gavald{\`a}},
  booktitle={SDM},
  year={2007}
}
We present a new approach for dealing with distribution change and concept drift when learning from data sequences that may vary with time. We use sliding windows whose size, instead of being fixed a priori, is recomputed online according to the rate of change observed from the data in the window itself: The window will grow automatically when the data is stationary, for greater accuracy, and will shrink automatically when change is taking place, to discard stale data. This delivers the user or… 
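The adaptive-window idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `AdaptiveWindow`, the parameter `delta`, and the Hoeffding-style cut threshold are assumptions chosen for illustration, inputs are assumed to lie in [0, 1], and the window is rescanned naively on every insertion rather than kept in a compressed form.

```python
import math
from collections import deque


class AdaptiveWindow:
    """Naive adaptive window over values in [0, 1] (illustrative sketch only)."""

    def __init__(self, delta=0.01):
        self.delta = delta      # hypothetical confidence parameter
        self.window = deque()   # oldest value on the left, newest on the right

    def _has_significant_split(self):
        """True if some old/new split of the window has clearly different means."""
        n = len(self.window)
        if n < 2:
            return False
        values = list(self.window)
        total = sum(values)
        delta_prime = self.delta / n
        head_sum = 0.0
        for i in range(1, n):
            head_sum += values[i - 1]
            n0, n1 = i, n - i
            mean_old = head_sum / n0
            mean_new = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            # Assumed Hoeffding-style threshold for the difference of means.
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean_old - mean_new) > eps_cut:
                return True
        return False

    def add(self, x):
        """Insert x; return True if the window shrank (change suspected)."""
        self.window.append(x)
        shrank = False
        # Discard the oldest values while any split still looks inconsistent.
        while self._has_significant_split():
            self.window.popleft()
            shrank = True
        return shrank

    def mean(self):
        """Average of the values currently kept in the window."""
        return sum(self.window) / len(self.window) if self.window else 0.0
```

On a stationary stream the window in this sketch only grows; after a sustained shift in the mean, `add` starts returning `True` and old values are dropped until the remaining window is statistically consistent again.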

Adaptive parameter-free learning from evolving data streams

A method for developing algorithms that can adaptively learn from data streams that change over time, based on using change detectors and estimator modules at the right places and choosing implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm.

Adaptive Learning from Evolving Data Streams

A method for developing algorithms that can adaptively learn from data streams that drift over time, based on using change detectors and estimator modules at the right places and choosing implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm.

Learning Decision Trees Adaptively from Data Streams with Time Drift

A new algorithm based on Hulten-Spencer-Domingos's CVFDT that overcomes some of the shortcomings of CVFDT, specifically, dependence on user-entered parameters that determine the guessed speed of change.

Learning from Data Streams: Synopsis and Change Detection

This work presents a feasible approach to detecting concept drift in a data stream context, using incremental histograms and monitoring of data distributions.

Sequential Change Detection on Data Streams

This work adopts the sound statistical method of sequential hypothesis testing to study the problem of detecting when there is a change in the input stream, which makes models stale and inaccurate.

Adaptive learning and mining for data streams and frequent patterns

This thesis proposes and illustrates a framework for developing algorithms that can adaptively learn from data streams that change over time, and introduces a general methodology to identify closed patterns in a data stream, using Galois Lattice Theory.

Detecting changes in streaming data with information-theoretic windowing

This study proposes a novel method of detecting changes via data compression, by introducing minimum description length (MDL) change statistics to an adaptive windowing regime, and introduces the notion of asymptotic reliability as a criterion for change point detection algorithms.

Remember the Good, Forget the Bad, do it Fast - Continuous Learning over Streaming Data

This work considers adaptive learning algorithms for the analysis of continuously evolving network data streams, using a dynamic, variable length system memory which automatically adapts to concept drifts in the underlying data.

Efficient handling of concept drift and concept evolution over Stream Data

This paper presents an efficient framework that is based on the same principle as SAND but exploits dynamic programming and executes the change detection module selectively. It also provides theoretical justification of the confidence calculation and shows the effect of concept drift on subsequent confidence scores.

Online and Non-Parametric Drift Detection Methods Based on Hoeffding’s Bounds

Two main approaches to handling concept drift regardless of the learning model are proposed: the first involves moving averages and is more suitable for detecting abrupt changes, and the second follows a widespread intuitive idea for dealing with gradual changes using weighted moving averages.

References

Detecting Concept Drift with Support Vector Machines

A new method to recognize and handle concept changes with support vector machines is proposed; it maintains a window on the training data and can effectively select an appropriate window size in a robust way.

Sampling from a moving window over streaming data

This work introduces the problem of sampling from a moving window of recent items from a data stream and develops two algorithms, the first of which, "chain-sample", extends reservoir sampling to deal with the expiration of data elements from the sample and the second, "priority-sample", works even when the number of elements in the window can vary dynamically over time.
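As a rough illustration of sampling from a window whose element count varies, the sketch below keeps one uniform sample from the items that arrived within the last `horizon` time units by giving every item a random priority and reporting the live item with the highest priority. The class name `PrioritySampler`, the `horizon` parameter, and the method names are hypothetical, and the sketch is a simplified reading of the priority idea, not the paper's algorithm as published.

```python
import random
from collections import deque


class PrioritySampler:
    """One uniform random sample from items newer than `horizon` time units."""

    def __init__(self, horizon, rng=None):
        self.horizon = horizon
        self.rng = rng if rng is not None else random.Random()
        # Entries are (timestamp, priority, item). Priorities strictly decrease
        # from left to right, so the leftmost entry is the current sample.
        self.candidates = deque()

    def add(self, item, timestamp):
        """Record an item; timestamps are assumed to be non-decreasing."""
        priority = self.rng.random()
        # An older entry with a lower priority can never be the sample again:
        # the newcomer both outranks it and expires later.
        while self.candidates and self.candidates[-1][1] <= priority:
            self.candidates.pop()
        self.candidates.append((timestamp, priority, item))
        # Drop entries that have fallen out of the time window.
        while self.candidates[0][0] <= timestamp - self.horizon:
            self.candidates.popleft()

    def sample(self):
        """Return the sampled item, or None if nothing has been added."""
        return self.candidates[0][2] if self.candidates else None
```

Because the priorities are independent and identically distributed, each item currently inside the window is equally likely to hold the maximum, which is what makes the reported item a uniform sample. Only items that could still become the maximum are stored, so memory stays small in expectation.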

Learning Changing Concepts by Exploiting the Structure of Change

Using a deterministic analysis in a general metric space setting, this paper provides a technique for constructing a successful prediction algorithm, given a successful estimation algorithm, for the prediction of changing concepts.

Online classification of nonstationary data streams

  • Mark Last
  • Computer Science
    Intell. Data Anal.
  • 2002
OLIN, an online classification system that dynamically adjusts the size of the training window and the number of new examples between model re-constructions to the current rate of concept drift, is described and evaluated.

Learning with Drift Detection

A method that detects changes in the probability distribution of examples by monitoring the online error rate of the algorithm; the method is independent of the learning algorithm.
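One common way to turn error-rate monitoring into a drift signal is sketched below. The summary above does not spell out the rule, so everything here is an assumption for illustration: the class name `ErrorRateDriftDetector`, the 30-sample warm-up, and the two- and three-standard-deviation warning and drift levels measured against the best error rate seen so far.

```python
import math


class ErrorRateDriftDetector:
    """Monitors a stream of 0/1 prediction errors (1 = misclassified)."""

    def __init__(self, min_samples=30, warn_k=2.0, drift_k=3.0):
        self.min_samples = min_samples  # assumed warm-up length
        self.warn_k = warn_k            # assumed warning level (std deviations)
        self.drift_k = drift_k          # assumed drift level (std deviations)
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after a drift has been signalled."""
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """Feed one outcome; return 'stable', 'warning', or 'drift'."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n                # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)   # its binomial standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s       # best level observed so far
        # Strict comparisons avoid a spurious signal when s_min is zero.
        if p + s > self.p_min + self.drift_k * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warn_k * self.s_min:
            return "warning"
        return "stable"
```

The detector only sees whether each prediction was wrong, which is why a scheme of this kind can wrap any learner: on `'drift'` the caller would typically retrain or replace the model, using the examples seen since the last `'warning'`.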

Maintaining Stream Statistics over Sliding Windows

The problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far, is considered, and it is shown that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, the number of 1's can be estimated to within a factor of $1 + \epsilon$.

Learning in the Presence of Concept Drift and Hidden Contexts

A family of learning algorithms that flexibly react to concept drift and can take advantage of situations where contexts reappear is described, including a heuristic that constantly monitors the system's behavior.

Tracking drifting concepts by minimizing disagreements

This paper shows that if H is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of 7d + 1, yielding an efficient tracking algorithm for H which tolerates drift rates up to a constant times ε²/(d² ln(1/ε)).

Maintaining time-decaying stream aggregates

Surprisingly, even though maintaining decayed aggregates has become a widely used tool, this work seems to be the first both to explore it formally and to provide storage-efficient algorithms for important families of decay functions, including polynomial decay.

Data streams: algorithms and applications

Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.