Least Angle Regression

Abstract

The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm. (3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a Cp estimate of prediction error; this allows a principled choice among the range of possible LARS estimates. LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.

AMS 2000 subject classification. 62J07.

1. Introduction.
Automatic model-building algorithms are familiar, and sometimes notorious, in the linear model literature: Forward Selection, Backward Elimination, All Subsets regression and various combinations are used to automatically produce "good" linear models for predicting a response y on the basis of some measured covariates x_1, x_2, ..., x_m. Goodness is often defined in terms of prediction accuracy, but parsimony is another important criterion: simpler models are preferred for the sake of scientific insight into the x-y relationship. Two promising recent model-building algorithms, the Lasso and Forward Stagewise linear regression, will be discussed here, and motivated in terms of a computationally simpler method called Least Angle Regression. Least Angle Regression (LARS) relates to the classic model-selection method known as Forward Selection, …
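
As a rough illustration of the Forward Stagewise idea mentioned above, the sketch below follows the usual incremental description of the method: at each step, find the covariate most correlated with the current residual and move its coefficient a small amount in the direction of that correlation. It is a minimal sketch, not the paper's implementation. The function name `forward_stagewise`, the step size `eps`, the step cap `n_steps` and the stopping tolerance `tol` are all illustrative assumptions, and the covariates are assumed to be standardized (mean 0, unit length) with the response centered.

```python
def forward_stagewise(X, y, eps=0.01, n_steps=1000, tol=1e-12):
    """Incremental Forward Stagewise sketch (hypothetical helper).

    X is a list of n rows, each with m covariate values; y is a list of
    n responses. Assumes standardized covariates and a centered response.
    """
    n, m = len(X), len(X[0])
    beta = [0.0] * m          # coefficients start at zero
    residual = list(y)        # residual starts as the response itself

    for _ in range(n_steps):
        # Current correlations c_j = x_j' r between covariates and residual.
        corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(m)]

        # Pick the covariate with the largest absolute correlation.
        j = max(range(m), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < tol:   # assumed stopping rule: nothing left to explain
            break

        # Take a small step in the direction of the sign of the correlation.
        step = eps if corr[j] > 0 else -eps
        beta[j] += step
        for i in range(n):
            residual[i] -= step * X[i][j]

    return beta


# Toy usage: two orthogonal, standardized covariates; y depends only on the first.
s = 2 ** -0.5
X_demo = [[s, 0.0], [0.0, s], [-s, 0.0], [0.0, -s]]
y_demo = [2 * row[0] for row in X_demo]
beta_demo = forward_stagewise(X_demo, y_demo)
```

Smaller values of `eps` trace a finer coefficient path at the cost of more steps, which is the "less greedy" behavior contrasted with classic Forward Selection.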
