Feature selection in omics prediction problems using cat scores and false nondiscovery rate control

@article{Ahdesmaki2010FeatureSI,
  title={Feature selection in omics prediction problems using cat scores and false nondiscovery rate control},
  author={Miika J. Ahdesmaki and Korbinian Strimmer},
  journal={The Annals of Applied Statistics},
  year={2010},
  volume={4},
  pages={503--519}
}
We revisit the problem of feature selection in linear discriminant analysis (LDA), that is, when features are correlated. First, we introduce a pooled centroids formulation of the multiclass LDA predictor function, in which the relative weights of Mahalanobis-transformed predictors are given by correlation-adjusted $t$-scores (cat scores). Second, for feature selection we propose thresholding cat scores by controlling false nondiscovery rates (FNDR). Third, training of the classifier is based… 
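The cat-score construction is compact enough to sketch in code. Below is a minimal Python illustration for the two-class case, with illustrative names; it uses the plain empirical correlation matrix, whereas the paper works with regularized shrinkage estimates so that the decorrelation remains well defined when p >> n:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def cat_scores(X, y):
    """X: (n, p) data matrix; y: numpy array of binary labels in {0, 1}."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    # ordinary two-sample t-scores with pooled variance
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    v_pool = ((n0 - 1) * X0.var(axis=0, ddof=1)
              + (n1 - 1) * X1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    t = (m1 - m0) / np.sqrt(v_pool * (1 / n0 + 1 / n1))
    # cat scores: decorrelate the t-score vector, cat = P^(-1/2) t
    P = np.corrcoef(X, rowvar=False)  # replace by a shrinkage estimate when p >> n
    return fractional_matrix_power(P, -0.5) @ t
```

Features are then ranked by absolute cat score, and the FNDR-controlled threshold determines how many are retained; an implementation by the authors is available in the R package sda on CRAN.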

Citations

High-Dimensional Regression and Variable Selection Using CAR Scores
TLDR
The CAR score is introduced: a novel and highly effective criterion for variable ranking in linear regression, based on Mahalanobis-decorrelation of the explanatory variables, that provides a canonical ordering, encourages grouping of correlated predictors, and down-weights antagonistic variables.
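As a regression-side counterpart to the cat score, here is a hedged sketch of the CAR score as summarized above: marginal correlations between response and predictors, computed after Mahalanobis-decorrelation of the predictors. Function and variable names are ours, and the empirical correlation matrix again stands in for a regularized estimate:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def car_scores(X, y):
    """X: (n, p) predictors; y: (n,) continuous response."""
    n = len(y)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardized predictors
    ys = (y - y.mean()) / y.std(ddof=1)                # standardized response
    P_xx = np.corrcoef(X, rowvar=False)                # predictor correlation matrix
    p_xy = Xs.T @ ys / (n - 1)                         # marginal correlations with y
    # CAR scores: correlations with y after decorrelating the predictors
    return fractional_matrix_power(P_xx, -0.5) @ p_xy
```

The squared CAR scores decompose the proportion of variance explained, which is what makes the resulting ordering canonical.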
Stability of feature selection in classification issues for high-dimensional correlated data
TLDR
The present paper highlights the impact of dependence on the stability of feature selection and revisits this issue using flexible factor modeling of the covariance in the classical linear discriminant analysis (LDA) framework.
Variational nonparametric discriminant analysis
Effect Size Estimation and Misclassification Rate Based Variable Selection in Linear Discriminant Analysis
  • B. Klaus, Journal of Data Science, 2021
TLDR
This article explores how assessing the individual importance of variables (effect size estimation) can be used to perform variable selection, and proposes a new effect size estimation method that is conceptually simple and at the same time computationally efficient.
Variational discriminant analysis with variable selection
TLDR
A fast Bayesian method that seamlessly fuses classification and hypothesis testing via discriminant analysis is developed, in which reverse collapsed variational Bayes gives rise to variable selection that can be posed directly as multiple hypothesis testing using likelihood ratio statistics.
Comparison of the Gaussian Process Classification Approach with Classical Penalised Classification Methods in High Dimensional Omics Data
In this study we investigate (linear) Gaussian process (GP) priors for Bayesian classification, comparing them with more classical penalised classification methods. For training the GP classifier we…
Gene ranking and biomarker discovery under correlation
TLDR
A simple procedure is proposed that adjusts gene-wise t-statistics to account for correlations among genes; it improves the estimation of gene orderings and leads to higher power at a fixed true discovery rate, and vice versa.
An interpretable regression approach based on bi-sparse optimization
TLDR
A bi-sparse optimization-based regression model and corresponding algorithm, with reconstructed row and column kernel matrices in the framework of support vector regression (SVR), that significantly outperformed six other regression models in predictive accuracy, identification of the fewest representative instances, selection of the fewest important features, and interpretability of results.
An Introduction to Feature Selection
TLDR
This chapter demonstrates the negative effect of extra predictors on a number of models, as well as discussing typical approaches to supervised feature selection such as wrapper and filter methods and the danger of selection bias.
Signal identification for rare and weak features: higher criticism or false discovery rates?
TLDR
It is demonstrated that, in a rare-weak setting in the region of the phase space where signal identification is possible, both thresholds are practically indistinguishable, and thus HC thresholding is identical to using a simple local FDR cutoff.
...

References

Covariance‐regularized regression and classification for high dimensional problems
  • D. Witten, R. Tibshirani, Journal of the Royal Statistical Society, Series B, 2009
TLDR
It is shown that ridge regression, the lasso and the elastic net are special cases of covariance-regularized regression, and it is demonstrated that certain previously unexplored forms of covariance-regularized regression can outperform existing methods in a range of situations.
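One concrete instance of this view, which is safe to state: ridge regression is ordinary least squares with the empirical predictor covariance shifted toward the identity. A short sketch (note that lam here corresponds to n times the penalty in the usual X'X + lambda*I parameterization):

```python
import numpy as np

def ridge_via_covariance(X, y, lam=1.0):
    """Ridge coefficients computed from a regularized covariance estimate."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    S = Xc.T @ Xc / n      # empirical predictor covariance
    s_xy = Xc.T @ yc / n   # predictor-response covariance
    # shifting S toward the identity and solving recovers the ridge solution
    return np.linalg.solve(S + lam * np.eye(p), s_xy)
```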
Regularized Discriminant Analysis and Its Application in Microarrays
TLDR
These SCRDA methods generalize the idea of the nearest shrunken centroids of Prediction Analysis of Microarrays (PAM) to classical discriminant analysis; they perform uniformly well in multivariate classification problems and in particular outperform the currently popular PAM.
Regularized linear discriminant analysis and its application in microarrays.
TLDR
Through both simulated data and real-life data, it is shown that this method performs very well in multivariate classification problems, often outperforming the PAM method and proving as competitive as support vector machine classifiers.
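A hedged sketch of the SCRDA idea as summarized in these two entries: regularize the pooled within-class covariance toward the identity, then soft-threshold the resulting discriminant directions, mirroring the centroid shrinkage of PAM. Parameter names and defaults are illustrative; in practice both tuning parameters are chosen by cross-validation:

```python
import numpy as np

def scrda_directions(X, y, alpha=0.5, delta=0.5):
    """Shrunken, regularized centroid directions, one per class."""
    classes = np.unique(y)
    n, p = X.shape
    # pooled within-class covariance
    S = sum((np.sum(y == k) - 1) * np.cov(X[y == k], rowvar=False)
            for k in classes) / (n - len(classes))
    S_reg = alpha * S + (1 - alpha) * np.eye(p)   # regularize toward the identity
    S_inv = np.linalg.inv(S_reg)
    dirs = {}
    for k in classes:
        beta = S_inv @ X[y == k].mean(axis=0)     # regularized centroid direction
        # soft thresholding zeroes out features with small weights
        dirs[k] = np.sign(beta) * np.maximum(np.abs(beta) - delta, 0.0)
    return dirs
```

A new observation is then assigned to the class whose direction yields the largest discriminant score, together with the usual class-specific offset terms.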
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
TLDR
In the most challenging RW settings, HCT uses an unconventionally low threshold, which keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance.
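The HC threshold itself is simple to compute: sort the feature p-values, evaluate the HC objective at each rank, and keep every feature whose p-value is at most the value at the maximizer. A minimal sketch, restricting the search to the smallest alpha0 fraction of p-values as is conventional:

```python
import numpy as np

def hc_threshold(pvalues, alpha0=0.10):
    """Return the p-value cutoff selected by higher criticism."""
    N = len(pvalues)
    p_sorted = np.clip(np.sort(pvalues), 1e-12, 1 - 1e-12)  # guard against 0/1
    i = np.arange(1, N + 1)
    hc = np.sqrt(N) * (i / N - p_sorted) / np.sqrt(p_sorted * (1 - p_sorted))
    k = max(1, int(alpha0 * N))   # search only the smallest alpha0 fraction
    i_hat = np.argmax(hc[:k])
    return p_sorted[i_hat]        # keep features with p-value <= this cutoff
```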
High Dimensional Classification Using Features Annealed Independence Rules.
TLDR
The conditions under which all the important features can be selected by the two-sample t-statistic are established, and the choice of the optimal number of features, or equivalently the threshold value of the test statistic, is proposed based on an upper bound of the classification error.
Empirical Bayes Estimates for Large-Scale Prediction Problems
  • B. Efron, Journal of the American Statistical Association, 2009
TLDR
An empirical Bayes approach to large-scale prediction is proposed, in which the optimal Bayes prediction rule is estimated using the data from all of the predictors.
Gene ranking and biomarker discovery under correlation
TLDR
A simple procedure is proposed that adjusts gene-wise t-statistics to account for correlations among genes; it improves the estimation of gene orderings and leads to higher power at a fixed true discovery rate, and vice versa.
Optimality Driven Nearest Centroid Classification from Genomic Data
TLDR
This work introduces a new feature selection approach for high-dimensional nearest centroid classifiers, based on the theoretically optimal choice of a given number of features, and applies it to clinical classification from gene-expression microarrays, demonstrating that the proposed method can outperform existing nearest centroid classifiers.
Accurate Ranking of Differentially Expressed Genes by a Distribution-Free Shrinkage Approach
TLDR
The “shrinkage t” statistic is introduced, based on a novel, model-free shrinkage estimate of the variance vector across genes derived in a quasi-empirical Bayes setting; it consistently leads to highly accurate rankings.
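The variance-shrinkage step behind the shrinkage-t statistic can be sketched as a James-Stein-type average of each feature's empirical variance with the median variance, using an analytically chosen intensity. A minimal version, ignoring edge cases such as all variances being equal:

```python
import numpy as np

def shrinkage_variances(X):
    """X: (n, p) observations of one group; returns shrunken per-feature variances."""
    n, p = X.shape
    w = (X - X.mean(axis=0)) ** 2                 # centered squared deviations
    w_bar = w.mean(axis=0)
    v = n / (n - 1) * w_bar                       # empirical variances
    var_v = n / (n - 1) ** 3 * ((w - w_bar) ** 2).sum(axis=0)  # estimated Var(v_k)
    v_target = np.median(v)                       # shrinkage target
    lam = min(1.0, var_v.sum() / ((v - v_target) ** 2).sum())  # analytic intensity
    return lam * v_target + (1 - lam) * v
```

These shrunken variances then replace the empirical ones in the denominator of the usual t-score.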
...