# Feature selection in omics prediction problems using cat scores and false nondiscovery rate control

```bibtex
@article{Ahdesmaki2010FeatureSI,
  title   = {Feature selection in omics prediction problems using cat scores and false nondiscovery rate control},
  author  = {Miika J. Ahdesmaki and Korbinian Strimmer},
  journal = {The Annals of Applied Statistics},
  year    = {2010},
  volume  = {4},
  pages   = {503--519}
}
```

We revisit the problem of feature selection in linear discriminant analysis (LDA), that is, when features are correlated. First, we introduce a pooled centroids formulation of the multiclass LDA predictor function, in which the relative weights of Mahalanobis-transformed predictors are given by correlation-adjusted $t$-scores (cat scores). Second, for feature selection we propose thresholding cat scores by controlling false nondiscovery rates (FNDR). Third, training of the classifier is based…
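The core quantity described in the abstract, the correlation-adjusted t-score, can be sketched in a few lines of NumPy: ordinary two-sample t-scores are decorrelated by the inverse matrix square root of the feature correlation matrix. This is a minimal illustration using plain empirical estimates only; the paper's actual procedure (implemented in the authors' R package `sda`) additionally applies James-Stein shrinkage to variances and correlations, which is omitted here.

```python
import numpy as np

def cat_scores(X, y):
    """Unshrunk sketch of correlation-adjusted t-scores (cat scores)
    for a two-class problem. X is an (n, p) data matrix, y a 0/1
    label vector. Returns a length-p vector of cat scores."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    # pooled within-class variance per feature
    s2 = ((n0 - 1) * X0.var(axis=0, ddof=1)
          + (n1 - 1) * X1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    # ordinary two-sample t-scores
    t = (X0.mean(axis=0) - X1.mean(axis=0)) / np.sqrt(s2 * (1/n0 + 1/n1))
    # pooled within-class feature correlation matrix P
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    P = np.corrcoef(centered, rowvar=False)
    # inverse matrix square root of P via eigendecomposition
    w, V = np.linalg.eigh(P)
    P_inv_sqrt = V @ np.diag(1 / np.sqrt(np.clip(w, 1e-8, None))) @ V.T
    # cat scores; these reduce to ordinary t-scores when P = I
    return P_inv_sqrt @ t
```

When the features are uncorrelated (P equal to the identity), the cat scores coincide with the usual t-scores, which is why they can serve as a drop-in replacement for t-score-based feature ranking under correlation.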

## 136 Citations

High-Dimensional Regression and Variable Selection Using CAR Scores

- Computer Science
- 2011

The CAR score is introduced, a novel and highly effective criterion for variable ranking in linear regression based on Mahalanobis-decorrelation of the explanatory variables that provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables.
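The CAR score summarized above admits a similarly compact sketch: the vector of marginal correlations corr(X_j, y) is decorrelated by the inverse matrix square root of the predictor correlation matrix. This is an illustrative, unshrunk version only; the published method uses shrinkage estimates (implemented in the authors' R package `care`).

```python
import numpy as np

def car_scores(X, y):
    """Unshrunk sketch of CAR scores (correlation-adjusted marginal
    correlations) for linear regression. X is an (n, p) predictor
    matrix, y a length-n response. Returns a length-p score vector."""
    # standardize predictors and response
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    # marginal correlations corr(X_j, y)
    r_xy = Xs.T @ ys / (len(y) - 1)
    # predictor correlation matrix P and its inverse square root
    P = np.corrcoef(X, rowvar=False)
    w, V = np.linalg.eigh(P)
    P_inv_sqrt = V @ np.diag(1 / np.sqrt(np.clip(w, 1e-8, None))) @ V.T
    # CAR scores; reduce to marginal correlations when P = I
    return P_inv_sqrt @ r_xy
```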

Stability of feature selection in classification issues for high-dimensional correlated data

- Computer Science, Stat. Comput.
- 2016

The present paper highlights the impact of dependence in terms of instability of feature selection and revisits the above issue using a flexible factor modeling for the covariance in the classical linear discriminant analysis (LDA) framework.

Effect Size Estimation and Misclassification Rate Based Variable Selection in Linear Discriminant Analysis

- Business
- 2012

This article explores how an assessment of the individual importance of variables (effect size estimation) can be used to perform variable selection, and proposes a new, conceptually simple effect size estimation method that is also computationally efficient.

Variational discriminant analysis with variable selection

- Computer Science, Stat. Comput.
- 2020

A fast Bayesian method that seamlessly fuses classification and hypothesis testing via discriminant analysis is developed and reverse collapsed variational Bayes gives rise to variable selection that can be directly posed as a multiple hypothesis testing approach using likelihood ratio statistics.

Comparison of the Gaussian process classification approach with classical penalised classification methods in high-dimensional omics data

- Computer Science
- 2009

In this study we investigate (linear) Gaussian process (GP) priors for Bayesian classification comparing with other more classical penalised classification methods. For training the GP classifier we…

Gene ranking and biomarker discovery under correlation

- Biology, Bioinform.
- 2009

A simple procedure is proposed that adjusts gene-wise t-statistics to take account of correlations among genes and improves estimation of gene orderings and leads to higher power for fixed true discovery rate, and vice versa.

An interpretable regression approach based on bi-sparse optimization

- Computer Science, Applied Intelligence
- 2020

A bi-sparse optimization-based regression model and corresponding algorithm with reconstructed row and column kernel matrices in the framework of support vector regression (SVR) that significantly outperformed the other six regression models in predictive accuracy, identification of the fewest representative instances, selection of the fewest important features, and interpretability of results.

Signal identification for rare and weak features: higher criticism or false discovery rates?

- Computer Science, Biostatistics
- 2013

It is demonstrated that in a rare-weak setting, in the region of the phase space where signal identification is possible, both thresholds are practically indistinguishable, and thus HC thresholding is identical to using a simple local FDR cutoff.

An Introduction to Feature Selection

- Computer Science
- 2013

This chapter demonstrates the negative effect of extra predictors on a number of models, as well as discussing typical approaches to supervised feature selection such as wrapper and filter methods and the danger of selection bias.

## References

Showing 1–10 of 35 references

Covariance‐regularized regression and classification for high dimensional problems

- Mathematics, Computer Science, Journal of the Royal Statistical Society, Series B (Statistical Methodology)
- 2009

It is shown that ridge regression, the lasso and the elastic net are special cases of covariance-regularized regression, and it is demonstrated that certain previously unexplored forms of covariance-regularized regression can outperform existing methods in a range of situations.

Regularized Discriminant Analysis and Its Application in Microarrays

- Computer Science
- 2004

These SCRDA methods generalize the idea of the nearest shrunken centroids of Prediction Analysis of Microarrays (PAM) to classical discriminant analysis and perform uniformly well in multivariate classification problems, in particular outperforming the currently popular PAM.

Regularized linear discriminant analysis and its application in microarrays.

- Computer Science, Biostatistics
- 2007

Through both simulated and real-life data, it is shown that this method performs very well in multivariate classification problems, often outperforms the PAM method and can be as competitive as support vector machine classifiers.

Higher criticism thresholding: Optimal feature selection when useful features are rare and weak

- Computer Science, Proceedings of the National Academy of Sciences
- 2008

In the most challenging RW settings, HCT uses an unconventionally low threshold, which keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance.

High Dimensional Classification Using Features Annealed Independence Rules.

- Computer Science, Annals of Statistics
- 2008

The conditions under which all the important features can be selected by the two-sample t-statistic are established, and the choice of the optimal number of features, or equivalently the threshold value of the test statistic, is proposed based on an upper bound of the classification error.

Empirical Bayes Estimates for Large-Scale Prediction Problems

- Mathematics, Journal of the American Statistical Association
- 2009

An empirical Bayes approach to large-scale prediction, where the optimum Bayes prediction rule is estimated employing the data from all of the predictors, is proposed.

Gene ranking and biomarker discovery under correlation

- Biology, Bioinform.
- 2009

A simple procedure is proposed that adjusts gene-wise t-statistics to take account of correlations among genes and improves estimation of gene orderings and leads to higher power for fixed true discovery rate, and vice versa.

Optimality Driven Nearest Centroid Classification from Genomic Data

- Computer Science, PLoS ONE
- 2007

This work introduces a new feature selection approach for high-dimensional nearest centroid classifiers that is based on the theoretically optimal choice of a given number of features, and applies it to clinical classification based on gene-expression microarrays, demonstrating that the proposed method can outperform existing nearest centroid classifiers.

Modified linear discriminant analysis approaches for classification of high-dimensional microarray data

- Computer Science, Comput. Stat. Data Anal.
- 2009

Accurate Ranking of Differentially Expressed Genes by a Distribution-Free Shrinkage Approach

- Computer Science, Statistical Applications in Genetics and Molecular Biology
- 2007

The shrinkage t statistic is introduced, a novel and model-free shrinkage estimate of the variance vector across genes that is derived in a quasi-empirical Bayes setting and consistently leads to highly accurate rankings.