Continual Rare-Class Recognition with Emerging Novel Subclasses

  • Hung T. Nguyen, Xuejian Wang, Leman Akoglu
Given a labeled dataset that contains a rare (or minority) class of of-interest instances, as well as a large class of instances that are not of interest, how can we learn to recognize future of-interest instances over a continuous stream? We introduce RaRecognize, which (i) estimates a general decision boundary between the rare and the majority class, (ii) learns to recognize the individual rare subclasses that exist within the training data, and (iii) flags instances from previously unseen…
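The three-part decision scheme described above can be illustrated with a toy nearest-centroid sketch. This is an illustrative assumption, not the paper's actual model (which builds regularized classifiers over text features); the class `RareClassRecognizer`, the `radius` acceptance threshold, and the centroid distances are all invented here for the sketch:

```python
import math

def centroid(vecs):
    # Component-wise mean of a list of equal-length feature vectors.
    d = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(d)]

def dist(a, b):
    # Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class RareClassRecognizer:
    """Toy sketch of the three decisions: (i) a general rare-vs-majority
    boundary, (ii) one recognizer per known rare subclass, and (iii) a
    novelty flag for rare instances that match no known subclass."""

    def __init__(self, majority, subclasses, radius=1.0):
        # majority: list of vectors; subclasses: {name: list of vectors}
        self.maj_center = centroid(majority)
        self.sub_centers = {k: centroid(v) for k, v in subclasses.items()}
        self.radius = radius  # assumed per-subclass acceptance radius

    def predict(self, x):
        d_maj = dist(x, self.maj_center)
        best = min(self.sub_centers, key=lambda k: dist(x, self.sub_centers[k]))
        d_best = dist(x, self.sub_centers[best])
        if d_maj < d_best:         # (i) falls on the majority side
            return "majority"
        if d_best <= self.radius:  # (ii) recognized known rare subclass
            return best
        return "novel-rare"        # (iii) emerging, previously unseen subclass
```

A rare instance that crosses the general boundary but lands outside every known subclass region is what step (iii) flags as a novel subclass.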


Continual Learning for Recurrent Neural Networks: a Review and Empirical Evaluation
Learning continuously throughout a model's lifetime is fundamental to deploying machine learning solutions that are robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent…
End-to-End Continual Rare-Class Recognition with Emerging Novel Subclasses
Through extensive experiments, it is shown that RaRecognize outperforms state-of-the-art baselines on three real-world datasets that contain documents related to corporate risk and (natural and man-made) disasters as rare classes.
Unseen Class Discovery in Open-world Classification
  • Lei Shu, Hu Xu, Bing Liu
  • Computer Science, Mathematics
  • 2018
Proposes a joint open classification model with a sub-model that classifies whether a pair of examples belongs to the same or different classes; this sub-model can serve as a distance function for clustering to discover the hidden classes among the rejected examples.
Open-world Learning and Application to Product Classification
Proposes a new open-world learning (OWL) method based on meta-learning that maintains only a dynamic set of seen classes, allowing new classes to be added or deleted without re-training the model.
iCaRL: Incremental Classifier and Representation Learning
iCaRL can learn many classes incrementally over a long period of time where other strategies quickly fail; this distinguishes it from earlier works that were fundamentally limited to fixed data representations and therefore incompatible with deep learning architectures.
Lifelong Machine Learning
  • Zhiyuan Chen, B. Liu
  • Computer Science
    Synthesis Lectures on Artificial Intelligence and Machine Learning
  • 2016
As statistical machine learning matures, it is time to make a major effort to break the isolated learning tradition and to study lifelong learning to bring machine learning to new heights.
Overcoming Catastrophic Forgetting by Incremental Moment Matching
IMM incrementally matches the moments of the posterior distributions of the neural networks trained on the first and second tasks, respectively, to smooth the search space of the posterior parameters.
Classification Under Streaming Emerging New Classes: A Solution Using Completely-Random Trees
This paper investigates an important problem in stream mining, classification under streaming emerging new classes (SENC), and proposes an alternative approach that uses unsupervised learning as the basis for solving it.
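The completely-random-tree idea can be sketched in miniature: grow trees with random axis-aligned cuts over the known classes, then treat test points that land in low-mass leaves as candidates for an emerging class. This is an isolation-style toy under stated assumptions, not the paper's actual algorithm; the helpers `build_tree`, `leaf_size`, and `known_mass` are invented for this sketch:

```python
import random

def build_tree(points, depth, max_depth):
    # Completely-random tree: pick a random feature and a random cut
    # within the data range at this node; no labels are used.
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    f = random.randrange(len(points[0]))
    lo = min(p[f] for p in points)
    hi = max(p[f] for p in points)
    if lo == hi:
        return {"size": len(points)}
    cut = random.uniform(lo, hi)
    return {"f": f, "cut": cut,
            "left": build_tree([p for p in points if p[f] < cut],
                               depth + 1, max_depth),
            "right": build_tree([p for p in points if p[f] >= cut],
                                depth + 1, max_depth)}

def leaf_size(tree, x):
    # Route x to its leaf and report how many training points live there.
    while "f" in tree:
        tree = tree["left"] if x[tree["f"]] < tree["cut"] else tree["right"]
    return tree["size"]

def known_mass(forest, x):
    # Average training mass of the leaves x falls into; a small value
    # suggests x lies outside the known classes (an emerging class).
    return sum(leaf_size(t, x) for t in forest) / len(forest)

random.seed(0)
known = [[random.gauss(0, 0.3), random.gauss(0, 0.3)] for _ in range(400)]
forest = [build_tree(known, 0, max_depth=6) for _ in range(25)]
```

A streaming system would additionally retire old trees and grow new ones as confirmed novel-class instances arrive, which is where the continual-update aspect comes in.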
Streaming Classification with Emerging New Class by Class Matrix Sketching
The proposed method dynamically maintains two low-dimensional matrix sketches to detect emerging new classes, classify known classes, and update the model over the data stream, and is shown to be superior to existing methods.
DOC: Deep Open Classification of Text Documents
This paper proposes a novel deep-learning-based approach that dramatically outperforms existing state-of-the-art techniques and is applicable to open text learning and text classification.
Distributed Representations of Sentences and Documents
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.