Unobserved classes and extra variables in high-dimensional discriminant analysis

Michael Fop, Pierre-Alexandre Mattei, Charles Bouveyron, Thomas Brendan Murphy
Advances in Data Analysis and Classification, pp. 55-92

In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach… 

Anomaly and Novelty detection for robust semi-supervised learning

The present work introduces a robust and adaptive Discriminant Analysis rule, capable of handling situations in which one or more of the aforementioned problems occur.

Variational Inference for Semiparametric Bayesian Novelty Detection in Large Datasets

This paper focuses on a two-stage Bayesian semiparametric novelty detector, also known as Brand, recently introduced in the literature, and proposes a variational Bayes approach that provides an efficient algorithm for posterior approximation.

A two-stage Bayesian semiparametric model for novelty detection with robust prior information

A two-stage Bayesian semiparametric novelty detector is proposed, building upon prior information robustly extracted from a set of complete learning units, and a general-purpose multivariate methodology is devised that also extends to handle functional data objects.

ACTIVITY REPORT Project-Team Models and Algorithms for Artificial Intelligence

  • Computer Science
  • 2022
A new stochastic approximation version of the EM algorithm is proposed in which samples are not drawn from the exact distribution in the expectation phase of the procedure; the algorithm is shown to converge toward critical points of the observed likelihood and is designed to favor convergence toward global maxima.



Adaptive Mixture Discriminant Analysis for Supervised Learning with Unobserved Classes

This work introduces a model-based discriminant analysis method, called adaptive mixture discriminant analysis (AMDA), which can detect several unobserved groups of points and can adapt the learned classifier to the new situation.

Discriminant analysis

This article focuses on the form of classification known as supervised classification or discriminant analysis, applicable in situations where there are data of known origin with respect to the predefined classes from which a classifier can be constructed to assign an unclassified entity to one of these classes.

A Mixture Model and EM-Based Algorithm for Class Discovery, Robust Classification, and Outlier Rejection in Mixed Labeled/Unlabeled Data Sets

A novel mixture model is proposed that treats as observed data not only the feature vector and the class label, but also the presence or absence of a label for each sample, in order to address problems involving both known and unknown classes.

General sparse multi-class linear discriminant analysis

Sparse Discriminant Analysis

This work proposes sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously in the high-dimensional setting.

Regularized Discriminant Analysis

Alternatives to the usual maximum likelihood estimates for the covariance matrices are proposed, characterized by two parameters, the values of which are customized to individual situations by jointly minimizing a sample-based estimate of future misclassification risk.

Class Discovery in Galaxy Classification

This work formally poses the question of new class discovery in mixed labeled/unlabeled data and proposes a solution based on mixture models, with up to a 57% reduction in classification error compared to a standard neural network classifier that uses only labeled data.

A Direct Approach for Sparse Quadratic Discriminant Analysis

A novel procedure named QUDA is proposed for QDA in analyzing high-dimensional data that aims to directly estimate the key quantities in the Bayes discriminant function including quadratic interactions and a linear index of the variables for classification.