Prediction by supervised principal components

Abstract

In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors that are selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer.
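The procedure outlined in the abstract (screen predictors by their association with the outcome, then run principal components on the retained subset) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `supervised_pc`, the use of absolute Pearson correlation as the association score, the fixed `threshold` argument, and the final least-squares fit are all assumptions made for the example. It also assumes no column of `X` is constant.

```python
import numpy as np

def supervised_pc(X, y, threshold, n_components=1):
    """Hypothetical sketch: screen predictors, then PCA on the survivors.

    X: (n_samples, n_features) array, y: (n_samples,) array.
    threshold: assumed cutoff on |correlation|; in practice it would be tuned.
    """
    # Center predictors and outcome.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Association score (assumption): absolute Pearson correlation
    # between each predictor and the outcome.
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    keep = np.abs(corr) > threshold
    if not keep.any():
        raise ValueError("no predictor passes the threshold")

    # Principal components of the reduced matrix via the SVD.
    U, s, Vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    Z = U[:, :n_components] * s[:n_components]  # component scores

    # Regress the outcome on the leading component(s), with an intercept.
    design = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return keep, Z, coef
```

Screening before the decomposition is what separates this from ordinary principal components regression: the leading component is computed only from predictors that appear related to the outcome, so it is less likely to be dominated by variation unrelated to it.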


Semantic Scholar estimates that this publication has 453 citations based on the available data.


Cite this paper

@inproceedings{Bair2006PredictionBS,
  title={Prediction by supervised principal components},
  author={Eric Bair and Trevor J. Hastie and Debashis Paul and Robert Tibshirani},
  year={2006}
}