A convex optimization approach to high-dimensional sparse quadratic discriminant analysis

T. Tony Cai and Linjun Zhang
arXiv: Methodology
In this paper, we study high-dimensional sparse Quadratic Discriminant Analysis (QDA) and aim to establish the optimal convergence rates for the classification error. Minimax lower bounds are established to demonstrate the necessity of structural assumptions such as sparsity conditions on the discriminating direction and differential graph for the possible construction of consistent high-dimensional QDA rules. We then propose a classification algorithm called SDAR using constrained convex… 
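For orientation, the classical two-class Gaussian QDA rule that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration that assumes the means, covariances, and class prior are known. It is not the SDAR procedure proposed in the paper, and the function names `qda_log_ratio` and `qda_classify` are hypothetical.

```python
import numpy as np


def qda_log_ratio(z, mu1, mu2, sigma1, sigma2, pi1=0.5):
    """Log posterior ratio log[pi1 * f1(z)] - log[(1 - pi1) * f2(z)]
    for two Gaussian classes N(mu1, sigma1) and N(mu2, sigma2).

    Assumes all parameters are known (oracle setting)."""
    omega1 = np.linalg.inv(sigma1)  # precision matrix of class 1
    omega2 = np.linalg.inv(sigma2)  # precision matrix of class 2
    d1 = z - mu1
    d2 = z - mu2
    # slogdet returns (sign, log|det|); covariances are positive definite
    _, logdet1 = np.linalg.slogdet(sigma1)
    _, logdet2 = np.linalg.slogdet(sigma2)
    quad = d2 @ omega2 @ d2 - d1 @ omega1 @ d1
    return 0.5 * quad + 0.5 * (logdet2 - logdet1) + np.log(pi1 / (1.0 - pi1))


def qda_classify(z, mu1, mu2, sigma1, sigma2, pi1=0.5):
    """Return 1 if the log ratio is positive, otherwise 2."""
    return 1 if qda_log_ratio(z, mu1, mu2, sigma1, sigma2, pi1) > 0 else 2
```

In high dimensions the plug-in version of this rule needs estimates of both precision matrices, which is where structural assumptions such as the sparsity conditions mentioned in the abstract come in.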

Robust Generalised Quadratic Discriminant Analysis
The present paper investigates the performance of the GQDA classifier when the classical estimators of the mean vector and the dispersion matrix used therein are replaced by various robust counterparts.
Optimal Classification for Functional Data
A central topic in functional data analysis is how to design an optimal decision rule, based on training samples, to classify a data function. We exploit the optimal classification problem when data
Non-splitting Neyman-Pearson Classifiers
The Neyman-Pearson (NP) binary classification paradigm constrains the more severe type of error (e.g., the type I error) under a preferred level while minimizing the other (e.g., the type II error).
Optimal Imperfect Classification for Gaussian Functional Data
Existing works on functional data classification focus on the construction of classifiers that achieve perfect classification in the sense that classification risk converges to zero asymptotically.
A fast iterative algorithm for high-dimensional differential network
A fast iterative algorithm is introduced for recovering the differential network from high-dimensional data with a small sample size, and it is shown to outperform other existing methods.
Simultaneous differential network analysis and classification for matrix-variate data with application to brain connectivity.
An ensemble-learning procedure is developed that simultaneously identifies the differential interaction patterns of brain regions between the case group and the control group and conducts medical diagnosis (classification) of the disease. It achieves satisfactory out-of-sample classification performance for the medical diagnosis of AD.


High dimensional linear discriminant analysis: optimality, adaptive algorithm and missing data
  • T. Cai, Linjun Zhang
  • Mathematics, Computer Science
    Journal of the Royal Statistical Society: Series B (Statistical Methodology)
  • 2019
A data-driven, tuning-free classification rule based on an adaptive constrained $\ell_1$ minimization approach is proposed and analyzed, and it is shown to be simultaneously rate-optimal over a collection of parameter spaces.
A Direct Estimation Approach to Sparse Linear Discriminant Analysis
This article considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix Ω and the
A Direct Approach for Sparse Quadratic Discriminant Analysis
A novel procedure named QUDA is proposed for QDA with high-dimensional data. It directly estimates the key quantities in the Bayes discriminant function, including the quadratic interactions and a linear index of the variables for classification.
A direct approach to sparse discriminant analysis in ultra-high dimensions
Sparse discriminant methods based on independence rules, such as the nearest shrunken centroids classifier (Tibshirani et al., 2002) and features annealed independence rules (Fan & Fan, 2008), have
Many contemporary studies involve the classification of a subject into two classes based on n observations of the p variables associated with the subject. Under the assumption that the variables are
CHIME: Clustering of high-dimensional Gaussian mixtures with EM algorithm and its optimality
Unsupervised learning is an important problem in statistics and machine learning with a wide range of applications. In this paper, we study clustering of high-dimensional Gaussian mixtures and
Sparse semiparametric discriminant analysis
High-dimensional sparse semiparametric discriminant analysis (SSDA) is developed that generalizes normal-theory discriminant analysis in two ways: it relaxes the Gaussian assumptions and can handle ultra-high dimensional classification problems.
CODA: high dimensional copula discriminant analysis
In high-dimensional settings, it is proved that the sparsity pattern of the discriminant features can be consistently recovered at the parametric rate, and that the expected misclassification error converges to the Bayes risk.
A Model of Double Descent for High-dimensional Binary Linear Classification
A model for logistic regression where only a subset of features of size p is used for training a linear classifier over n training samples is considered, and a phase-transition phenomenon for the case of Gaussian regressors is uncovered.
Regularized rank-based estimation of high-dimensional nonparanormal graphical models
A sparse precision matrix can be directly translated into a sparse Gaussian graphical model under the assumption that the data follow a joint normal distribution. This neat property makes