Some cautionary notes on the use of principal components regression

Ali S. Hadi and Robert F. Ling
The American Statistician
Abstract: Many textbooks on regression analysis include the methodology of principal components regression (PCR) as a way of treating multicollinearity problems. Although we have not encountered any strong justification of the methodology, we have encountered, through carrying out the methodology in well-known data sets with severe multicollinearity, serious actual and potential pitfalls in the methodology. We address these pitfalls as cautionary notes, numerical examples that use well-known…
An Account of Principal Components Analysis and Some Cautions on Using the Correct Formulas and the Correct Procedures in SPSS
The paper provides an account of the principal components regression (PCR) and uses some examples from the literature to illustrate the following: (1) the importance of PCR in the presence of
Pre-tuned Principal Component Regression and Several Variants
The regression coefficient estimates from ordinary least squares (OLS) have a low probability of being close to the real value when there is a multicollinearity problem in the design matrix. In order
Sequential Regression: A Neodescriptive Approach to Multicollinearity
Classical regression analysis uses partial coefficients to measure the influences of some variables (regressors) on another variable (regressand). However, a descriptive point of view shows that
A New Approach of Principal Component Regression Estimator with Applications to Collinear Data
In this paper, a new approach to estimating the model parameters in Principal Components (PCs) is developed. Principal Component Analysis (PCA) is a method of variable reduction that has found
Contributions to Linear Regression Diagnostics Using the Singular Value Decomposition: Measures to Identify Outlying Observations, Influential Observations and Collinearity in Multivariate Data
This thesis discusses the use of the singular value decomposition (SVD) in multiple regression with special reference to the problems of identifying outlying and/or influential observations, and the
Procedure for the Selection of Principal Components in Principal Components Regression
Since the least squares estimation is not appropriate when multicollinearity exists among the regressors of the linear regression model, the principal components regression is used to deal with the
A Projection Algorithm for Regression with Collinearity
Principal component regression (PCR) is often used in regression with multicollinearity. Although this method avoids the problems which can arise in the least squares (LS) approach, it is not
Linear Regression Analysis
After Part I, the text becomes mainly an application guide for SIMCA-P and MODDE, and the basic message is that when one has a multivariate problem, do either a PCA analysis or a PLS analysis, or both.
The principal correlation components estimator and its optimality
In regression analysis, the principal components regression estimator (PCRE) is often used to alleviate the effect of multicollinearity by deleting the principal components variables with smaller


An Analytic Variable Selection Technique for Principal Component Regression
Summary: This paper presents an analytic technique for deleting predictor variables from a linear regression model when principal components of X'X are removed to adjust for multicollinearities in the
A Note on the Use of Principal Components in Regression
The use of principal components in regression has received a lot of attention in the literature in the past few years, and the topic is now beginning to appear in textbooks. Along with the use of
Two Case Studies in the Application of Principal Component Analysis
There is a need for the extensive application of the present methods of multivariate analysis, including principal component analysis, over a wide range of problems and subjects, in order to test the practical value of the techniques.
The optimal set of principal component restrictions on a least-squares regression
When a researcher is confronted with multicollinearity in the standard linear model, he should consider restricting his estimates by the linear restriction implied by the deletion of that set of
Applied Regression Analysis (2nd ed.)
This book brings together a number of procedures developed for regression problems in current use and includes material that either has not previously appeared in a textbook or if it has appeared is not generally available.
The relations of the newer multivariate statistical methods to factor analysis.
A survey of developments in multivariate analysis during the last thirty years shows that some, though not all, of the purposes for which factor analysis has been used may now be better
Discarding Variables in a Principal Component Analysis. II: Real Data
In this paper it is shown for four sets of real data, all published examples of principal component analysis, that the number of variables used can be greatly reduced with little effect on the
On the Investigation of Alternative Regressions by Principal Component Analysis
In a multiple regression problem, let the p × 1 vector x consist of the dependent variable and p – 1 predictor variables. The correlation matrix of x is reduced to principal components. The
Discarding Variables in a Principal Component Analysis. I: Artificial Data
It is shown that several of the rejection methods, of differing types, each discard precisely those variables known to be redundant, for all but a few sets of data.
Biased Estimation in Regression: An Evaluation Using Mean Squared Error
Abstract: A mean squared error criterion is used to compare five estimators of the coefficients in a linear regression model: least squares, principal components, ridge regression, latent root, and a