Stream Classification

@inproceedings{Stefanowski2017StreamC,
  title={Stream Classification},
  author={Jerzy Stefanowski and Dariusz W. Brzezinski},
  booktitle={Encyclopedia of Machine Learning and Data Mining},
  year={2017}
}
Definition. Stream classification is a variant of incremental learning of classifiers that has to satisfy requirements specific to massive streams of data: restrictive processing time, limited memory, and one scan of incoming examples. Additionally, stream classifiers often have to be adaptive, as they usually act in dynamic, non-stationary environments where data and target concepts can change over time. To fulfill these requirements, new solutions include dedicated data management and… 
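The requirements named in the definition (one scan, bounded memory, adaptivity) can be illustrated with a minimal sketch. It is not a method from the encyclopedia entry: the class `FadingCentroidClassifier`, its `decay` parameter, and the helper `prequential_accuracy` are hypothetical names chosen for illustration. Each example is seen once, memory is one centroid per class, and exponential forgetting lets the model follow a changing concept.

```python
class FadingCentroidClassifier:
    """Nearest-centroid classifier updated one example at a time.

    Hypothetical illustration: each class centroid is an exponential
    moving average, so memory stays O(classes * features) and the
    influence of old examples fades as new ones arrive.
    """

    def __init__(self, n_features, decay=0.9):
        self.n_features = n_features
        self.decay = decay      # weight kept by the old centroid (assumed value)
        self.centroids = {}     # label -> list of per-feature means

    def predict(self, x):
        if not self.centroids:
            return None         # nothing learned yet

        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))

    def learn(self, x, y):
        if y not in self.centroids:
            self.centroids[y] = list(x)   # first example of a new class
            return
        c = self.centroids[y]
        for i in range(self.n_features):
            c[i] = self.decay * c[i] + (1.0 - self.decay) * x[i]


def prequential_accuracy(stream, model):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        total += 1
        model.learn(x, y)       # the example is then discarded (single scan)
    return correct / total if total else 0.0
```

A smaller `decay` forgets faster and so reacts sooner to drift, at the cost of noisier centroids on a stationary stream.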
Imbalanced Data Stream Classification Using Hybrid Data Preprocessing
TLDR
A novel algorithm that combines both over- and under-sampling techniques in order to create a more robust classifier dedicated to imbalanced data streams is proposed.
FERNN: A Fast and Evolving Recurrent Neural Network Model for Streaming Data Classification
TLDR
A novel variant of RNN, termed FERNN, is proposed, which features single-pass learning capability along with a self-evolution property, and is free from the normal-distribution assumption for streaming data, making it more flexible.
Enhancements for Sliding Window based Stream Classification
TLDR
New enhancements to sliding-window-based classification methods are proposed: they use the traditional kNN (k-nearest neighbors) method in a sliding window, include the mean of the previous instances as a nearest-neighbor instance, and generate an ensemble classifier.
Statistical Tests Ensemble Drift Detector
TLDR
A concept drift detector aiming to validate empirically the idea of implementing a drift detection method based on the combination of statistical tests as a viable option to improve the classification.
Mining Techniques for Streaming Data
TLDR
A model for mining streaming data is presented which describes the main phases of data stream manipulation, and the methods for summarizing data streams and creating synopses are reviewed.
Are We Overfitting to Experimental Setups in Recognition
TLDR
A new framework, FLUID, is constructed; it removes certain assumptions made by current experimental setups while integrating these sub-tasks via the following design choices: consuming sequential data, allowing for flexible training phases, being compute aware, and working in an open-world setting.
Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study
  • Mateusz Lango
  • Computer Science
    Foundations of Computing and Decision Sciences
  • 2019
TLDR
An experimental study including twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out in order to analyze the usefulness of imbalanced learning methods for sentiment classification and investigate the impact of class imbalance on sentiment corpora.
Real-Time Emotion Classification Using EEG Data Stream in E-Learning Contexts
TLDR
A real-time emotion classification system (RECS) based on Logistic Regression (LR), trained in an online fashion using the Stochastic Gradient Descent (SGD) algorithm, is proposed; it can effectively classify emotions in real time from the EEG data stream and achieved better accuracy and F1-score than other offline and online approaches.
From Offline to Real-Time Distributed Activity Recognition in Wireless Sensor Networks for Healthcare: A Review
TLDR
The state of the art and a global overview of research challenges of real-time distributed activity recognition in the field of healthcare are presented, and six main angles of optimization are defined: processing, memory, communication, energy, time, and accuracy.
Modeling and Prediction of Daily Traffic Patterns—WASK and SIX Case Study
TLDR
The efficiency of the two forecasting approaches differs across datasets: modeling-based methods achieved lower errors for SIX, while machine-learning-based methods did so for WASK, and forecasting for WASK turned out to be extremely challenging.

References

Active Learning with Evolving Streaming Data
TLDR
This paper develops two active learning strategies for streaming data that explicitly handle concept drift, based on uncertainty, dynamic allocation of labeling efforts over time and randomization of the search space.
Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm
TLDR
A new data stream classifier, called the Accuracy Updated Ensemble (AUE2), is proposed; it aims at reacting equally well to different types of drift and combines accuracy-based weighting mechanisms known from block-based ensembles with the incremental nature of Hoeffding Trees.
Mining concept-drifting data streams using ensemble classifiers
TLDR
This paper proposes a general framework for mining concept-drifting data streams using weighted ensemble classifiers, and shows that the proposed methods have substantial advantage over single-classifier approaches in prediction accuracy, and the ensemble framework is effective for a variety of classification models.
A survey on concept drift adaptation
TLDR
The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art and aims at providing a comprehensive introduction to the concept drift adaptation for researchers, industry analysts, and practitioners.
Incremental Rule-Based Learners for Handling Concept Drift: An Overview
TLDR
This paper reviews incremental rule-based learners designed for changing environments and describes four of the proposed algorithms: FLORA, AQ11-PM+WAH, FACIL and VFDR.
Mining Recurring Concepts in a Dynamic Feature Space
TLDR
MReC-DFS is a data stream classification system that addresses the challenges of learning recurring concepts in a dynamic feature space while simultaneously reducing the memory cost associated with storing past models; it uses an incremental feature selection method that dynamically determines the threshold between relevant and irrelevant features.
A Practical Approach to Classify Evolving Data Streams: Training with Limited Amount of Labeled Data
TLDR
Empirical evaluation on both synthetic data and real botnet traffic reveals that this approach, using only a small amount of labeled data for training, outperforms state-of-the-art stream classification algorithms that use twenty times more labeled data than this approach.
An Overview of Concept Drift Applications
TLDR
This chapter provides an application oriented view towards concept drift research, with a focus on supervised learning tasks, and constructs a reference framework for positioning application tasks within a spectrum of problems related to concept drift.
Very fast decision rules for classification in data streams
TLDR
The adaptive extension (AVFDR) detects changes in the process generating data and adapts the decision model; the experimental evaluation demonstrates that the algorithms achieve competitive results in comparison to alternative methods and that the adaptive methods are able to learn fast and compact rule sets from evolving streams.
Learning from Time-Changing Data with Adaptive Windowing
TLDR
A new approach for dealing with distribution change and concept drift when learning from data sequences that may vary with time is presented, using sliding windows whose size is recomputed online according to the rate of change observed from the data in the window itself.