Feature selection in omics prediction problems using cat scores and false nondiscovery rate control

@article{Ahdesmaki2010FeatureSI,
  title={Feature selection in omics prediction problems using cat scores and false nondiscovery rate control},
  author={M. Ahdesm{\"a}ki and K. Strimmer},
  journal={The Annals of Applied Statistics},
  year={2010},
  volume={4},
  pages={503--519}
}
  • M. Ahdesmäki, K. Strimmer
  • Published 2010
  • Mathematics
  • The Annals of Applied Statistics
  • We revisit the problem of feature selection in linear discriminant analysis (LDA), that is, when features are correlated. First, we introduce a pooled centroids formulation of the multiclass LDA predictor function, in which the relative weights of Mahalanobis-transformed predictors are given by correlation-adjusted $t$-scores (cat scores). Second, for feature selection we propose thresholding cat scores by controlling false nondiscovery rates (FNDR). Third, training of the classifier is based…
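
To make the idea of correlation-adjusted $t$-scores concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the common definition in which each class centroid is compared with the pooled centroid, scaled to a $t$-score, and then decorrelated by the inverse square root of the pooled within-class correlation matrix. It uses plain empirical estimates, so it needs more samples than features. The function names `inverse_sqrt` and `cat_scores` are hypothetical, and the FNDR thresholding step is not shown.

```python
import numpy as np


def inverse_sqrt(sym):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    # V diag(1/sqrt(lambda)) V^T
    return (vecs / np.sqrt(vals)) @ vecs.T


def cat_scores(X, y):
    """Illustrative cat scores: one column per class, one row per feature.

    Assumption: empirical (unregularized) variances and correlations,
    so this only works when samples clearly outnumber features.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    classes = np.unique(y)
    pooled_mean = X.mean(axis=0)

    # Residuals around each class centroid give the pooled within-class spread.
    resid = X.copy()
    for k in classes:
        resid[y == k] -= X[y == k].mean(axis=0)
    dof = n - len(classes)
    sd = np.sqrt((resid ** 2).sum(axis=0) / dof)
    corr = (resid / sd).T @ (resid / sd) / dof
    decorrelate = inverse_sqrt(corr)

    scores = np.empty((p, len(classes)))
    for j, k in enumerate(classes):
        n_k = np.sum(y == k)
        # t-score of the class centroid against the pooled centroid.
        scale = sd * np.sqrt(1.0 / n_k - 1.0 / n)
        t_k = (X[y == k].mean(axis=0) - pooled_mean) / scale
        # Correlation adjustment turns t-scores into cat scores.
        scores[:, j] = decorrelate @ t_k
    return scores
```

One plausible way to rank features from this output is by the sum of squared cat scores across classes, for example `(cat_scores(X, y) ** 2).sum(axis=1)`. When the features are uncorrelated, the correlation matrix is the identity and the scores reduce to ordinary $t$-scores.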
    120 Citations

    High-Dimensional Regression and Variable Selection Using CAR Scores
    • 124
    Stability of feature selection in classification issues for high-dimensional correlated data
    • 18
    • Highly Influenced
    Variational nonparametric discriminant analysis
    • 1
    Effect Size Estimation and Misclassification Rate Based Variable Selection in Linear Discriminant Analysis
    • B. Klaus
    • Mathematics, Computer Science
    • 2012
    • 3
    • Highly Influenced
    Variational discriminant analysis with variable selection
    • 1
    • Highly Influenced
    Gene ranking and biomarker discovery under correlation
    • 92
    An interpretable regression approach based on bi-sparse optimization
    An Introduction to Feature Selection
    • 49
    Signal identification for rare and weak features: higher criticism or false discovery rates?
    • 29