Regularization Paths for Generalized Linear Models via Coordinate Descent.

@article{friedman2010regularization,
  title={Regularization Paths for Generalized Linear Models via Coordinate Descent},
  author={Jerome H. Friedman and Trevor J. Hastie and Robert Tibshirani},
  journal={Journal of Statistical Software},
  volume={33},
  number={1},
}
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems, while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In…
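The cyclical coordinate descent the abstract describes can be sketched for the lasso case of a linear model: each coefficient is updated in turn by soft-thresholding its partial-residual correlation while all other coefficients are held fixed. This is a minimal illustration, not the paper's implementation; the function names are ours, and it assumes standardized predictors.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Cyclical coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1.

    Assumes the columns of X are standardized (mean 0, variance 1),
    so each coordinate update reduces to a single soft-threshold step.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution added back
            r_j = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z, lam)
    return beta
```

Computing a whole regularization path amounts to calling this with a decreasing sequence of `lam` values, warm-starting each fit from the previous solution.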


Distributed coordinate descent for generalized linear models with regularization

A novel algorithm for fitting regularized generalized linear models in a distributed environment: data are split between nodes by features, coordinate descent runs on each node, and a line search merges the results globally; the method is scalable and performs best when training on large, sparse datasets.


The orthogonal greedy algorithm is introduced and the resultant regression estimate is shown to have the oracle property of being equivalent to least squares regression on an asymptotically minimal set of relevant regressors under a strong sparsity condition.

Elastic Net Regularization Paths for All Generalized Linear Models

The reach of the elastic net-regularized regression is extended to all generalized linear model families, Cox models with (start, stop] data and strata, and a simplified version of the relaxed lasso.

An efficient algorithm for the non-convex penalized multinomial logistic regression

An efficient algorithm is introduced that can be uniformly applied to a class of non-convex penalties such as the smoothly clipped absolute deviation, minimax concave and bridge penalties and uses a uniform bound of the Hessian matrix in the quadratic approximation.

Lasso Regularization Paths for NARMAX Models via Coordinate Descent

A new algorithm for estimating NARMAX models with L1 regularization for models represented as a linear combination of basis functions, which can provide the most important regressors in very few inexpensive iterations.

A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression

The posterior mean from the method can be interpreted as solving a penalized regression problem, with the precise form of the penalty function being learned from the data by directly solving an optimization problem (rather than being tuned by cross-validation).

A coordinate majorization descent algorithm for ℓ1 penalized learning

A family of coordinate majorization descent algorithms for solving ℓ1 penalized learning problems is considered, in which each coordinate descent step is replaced with a coordinate-wise majorization descent operation.

A distributed algorithm for fitting generalized additive models

This work presents a distributed algorithm for fitting generalized additive models, based on the alternating direction method of multipliers (ADMM), in which the component functions of the model are fit independently, in parallel; a simple iteration yields convergence to the optimal generalized additive model.

Natural coordinate descent algorithm for L1-penalised regression in generalised linear models

Elements for Building Supervised Statistical Machine Learning Models

This chapter gives details of the linear multiple regression model, including its assumptions, some pros and cons, and maximum likelihood estimation. Gradient descent methods are described for learning the…



L1-regularization path algorithm for generalized linear models

A path following algorithm for L1-regularized generalized linear models that efficiently computes solutions along the entire regularization path by using the predictor–corrector method of convex optimization.

Sparse inverse covariance estimation with the graphical lasso.

Using a coordinate descent procedure for the lasso, a simple algorithm is developed that solves a 1000-node problem in at most a minute and is 30-4000 times faster than competing methods.
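As an illustration of the technique, the graphical lasso is available in scikit-learn (a third-party implementation, not the authors' R code); it estimates a sparse precision (inverse covariance) matrix under an ℓ1 penalty:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Simulated data from a 4-variable Gaussian with identity covariance
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(4), np.eye(4), size=500)

# Fit an L1-penalized estimate of the inverse covariance matrix;
# larger alpha drives more off-diagonal entries to exactly zero
model = GraphicalLasso(alpha=0.2).fit(X)
precision = model.precision_
```

Zeros in `precision` correspond to pairs of variables that are conditionally independent given the rest, which is what makes the sparse estimate interpretable as a graph.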

Piecewise linear regularized solution paths

We consider the generic regularized optimization problem β̂(λ) = argmin_β L(y, Xβ) + λJ(β). Efron, Hastie, Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407–499] have shown that for the LASSO, that…

Coordinate descent algorithms for lasso penalized regression

This paper tests two exceptionally fast algorithms for estimating regression coefficients with a lasso penalty and proves that a greedy form of the ℓ2 algorithm converges to the minimum value of the objective function.

An Interior-Point Method for Large-Scale l1-Regularized Logistic Regression

This paper describes an efficient interior-point method for solving large-scale l1-regularized logistic regression problems, and shows how a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.

Regression Shrinkage and Selection via the Lasso

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
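The constrained least-squares problem the summary describes can be written out explicitly:

```latex
\hat{\beta} = \arg\min_{\beta}
  \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t
```

Shrinking the bound t drives some coefficients exactly to zero, which is why the lasso performs variable selection as well as shrinkage.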

Regularization and variable selection via the elastic net

It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.


It is shown that coordinate descent is very competitive with the well-known LARS procedure in large lasso problems, can deliver a path of solutions efficiently, and can be applied to many other convex statistical problems such as the garotte and elastic net.

Scalable training of L1-regularized log-linear models

This work presents an algorithm Orthant-Wise Limited-memory Quasi-Newton (OWL-QN), based on L-BFGS, that can efficiently optimize the L1-regularized log-likelihood of log-linear models with millions of parameters.