Statistical Learning with Sparsity: The Lasso and Generalizations

by Trevor J. Hastie, Robert Tibshirani, and Martin J. Wainwright

A sparse statistical model has only a small number of nonzero parameters or weights, and is therefore much easier to estimate and interpret than a dense model. The authors discuss the application of ℓ1 penalties to generalized linear models and support vector machines, cover generalized penalties such as the elastic net and group lasso, and review numerical methods for optimization.
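As a concrete illustration of the shrinkage an ℓ1 penalty performs (a minimal sketch of my own, not code from the book): the core operation is elementwise soft-thresholding, which sets small coefficients exactly to zero.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Small entries are zeroed out entirely; large entries are shrunk toward zero.
# This is the mechanism by which the l1 penalty induces sparsity.
beta = np.array([2.5, -0.3, 0.0, 1.1, -2.0])
shrunk = soft_threshold(beta, 0.5)  # -> [2.0, 0.0, 0.0, 0.6, -1.5]
```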

Group selection and shrinkage: Structured sparsity for semiparametric models

This work introduces structured sparse estimators that combine group subset selection with shrinkage and offer an excellent compromise between fully linear and fully nonparametric alternatives.

Introduction to the LASSO

This article aims to provide an introduction to the LASSO method, formulated as a constrained quadratic programming problem, and to discuss convex-optimization-based approaches to solving the LASSO problem.
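One standard convex-optimization approach to this problem (a hedged sketch under my own assumptions, not the article's code) is cyclic coordinate descent on the objective (1/2n)||y − Xβ||² + λ||β||₁, where each coordinate update is a soft-thresholded univariate least-squares fit; the example data below are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n           # per-coordinate curvature
    r = y - X @ b                               # maintained residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                 # remove coordinate j from the fit
            z = X[:, j] @ r / n                 # univariate least-squares direction
            b[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]                 # restore the residual
    return b

# Hypothetical sparse recovery example: 3 true nonzero coefficients out of 10.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
beta_true = np.zeros(10)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.01 * rng.standard_normal(100)
b = lasso_cd(X, y, lam=0.1)
```

The update rule is the soft-thresholding operator applied coordinate-wise, which is what makes each subproblem solvable in closed form.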

An Asymptotic Rate for the LASSO Loss

This work provides large-deviations bounds between various measures of LASSO loss and their concentrating values predicted by the AMP state evolution, showing exponentially fast convergence (in n) when the measurement matrix X is i.i.d. Gaussian.

Hyperparameter selection for group-sparse regression: A probabilistic approach

Flexible Low-Rank Statistical Modeling with Missing Data and Side Information

A general statistical framework for low-rank modeling of matrix-valued data, based on convex optimization with a generalized nuclear norm penalty, is explored, showing that this approach encompasses maximum a posteriori estimation arising from Bayesian hierarchical modeling with latent factors.
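For intuition (a generic sketch, not the paper's framework): convex low-rank penalties of this kind are typically handled via the proximal operator of the nuclear norm, which soft-thresholds the singular values.

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of tau * ||.||_* (nuclear norm): soft-threshold the
    singular values, which shrinks the matrix toward lower rank."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Hypothetical example: an exactly rank-3 matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 8))
L = svt(A, 1.0)  # thresholding can only reduce the rank
```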

T-LoHo: A Bayesian Regularization Model for Structured Sparsity and Smoothness on Graphs

A new prior for high-dimensional parameters with graphical relations is proposed, referred to as the Tree-based Low-rank Horseshoe (T-LoHo) model, which generalizes the popular univariate Bayesian horseshoe shrinkage prior to the multivariate setting to detect structured sparsity and smoothness simultaneously.

The False Positive Control Lasso

It is shown that an existing model (the SQRT-Lasso) can be recast as a method of controlling the expected number of false positives, that a similar estimator can be used for all other generalized linear model classes, and that this approach can be fit with existing fast Lasso optimization solvers.

Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression

A new broad ensemble of dependent random design matrices with an explicit restricted eigenvalue (RE) bound is constructed; it allows a fixed matrix satisfying an easily verifiable condition to be incorporated into the design process, and permits the generation of designs with a lower storage requirement than a standard design matrix.

High-Dimensional LASSO-Based Computational Regression Models: Regularization, Shrinkage, and Selection

The regularization terms responsible for inducing coefficient shrinkage and variable selection, which lead to improved performance metrics, are discussed, making these modern computational regression models valuable tools for analyzing high-dimensional problems.

Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays

A new approach is proposed for fitting GLLVMs to high-dimensional data sets, based on approximating the model using a penalized quasi-likelihood and then using a Newton method and Fisher scoring to learn the model parameters.



Adaptive Lasso for sparse high-dimensional regression models

The adaptive Lasso has the oracle property even when the number of covariates is much larger than the sample size; under a partial orthogonality condition, in which the covariates with zero coefficients are weakly correlated with the covariates with nonzero coefficients, marginal regression can be used to obtain the initial estimator.
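A hedged sketch of this two-step recipe (my illustration, not the paper's code, with a small coordinate-descent lasso solver inlined for self-containment): marginal regression supplies the initial estimates, their inverses become adaptive weights, and the weighted lasso is solved by rescaling columns. The data dimensions below are illustrative; the paper's setting allows p much larger than n.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual for coordinate j
            z = X[:, j] @ r / n
            b[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return b

def adaptive_lasso(X, y, lam, gamma=1.0):
    n = X.shape[0]
    b_init = X.T @ y / n                        # marginal regression initial estimator
    w = 1.0 / (np.abs(b_init) ** gamma + 1e-8)  # adaptive weights (guarded against 0)
    c = lasso_cd(X / w, y, lam)                 # weighted lasso via column rescaling
    return c / w                                # map back to the original scale

# Hypothetical example: 2 true signals among 50 covariates.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + 0.1 * rng.standard_normal(200)
b = adaptive_lasso(X, y, lam=0.1)
```

Because the weights down-penalize covariates with large initial estimates, the bias on the true signals is much smaller than for the plain lasso at the same λ.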

Sparse Modeling: Theory, Algorithms, and Applications

An introduction to the growing field of sparse modeling, including application examples, problem formulations that yield sparse solutions, algorithms for finding such solutions, and recent theoretical results on sparse recovery is provided.

The Graphical Lasso: New Insights and Alternatives

This paper explains how GLASSO solves the dual of the graphical lasso penalized likelihood by block coordinate ascent, and proposes similar primal algorithms, P-GLASSO and DP-GLASSO, that also operate by block coordinate descent, with the precision matrix Θ as the optimization target.

Sparse permutation invariant covariance estimation

A method is proposed for constructing a sparse estimator of the inverse covariance (concentration) matrix in high-dimensional settings; it uses a penalized normal likelihood approach and enforces sparsity through a lasso-type penalty.

Sparse estimation of a covariance matrix.

The proposed penalized maximum likelihood problem is not convex; the method can also be used to solve a previously studied special case in which a desired sparsity pattern is prespecified, and it uses a majorize-minimize approach that iteratively solves convex approximations to the original nonconvex problem.

A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

A unified framework for establishing consistency and convergence rates for regularized M-estimators under high-dimensional scaling is provided; one main theorem is stated, and it is shown how it can be used both to re-derive several existing results and to obtain several new ones.

The group lasso for logistic regression

An efficient algorithm for solving the corresponding convex optimization problem is presented that is especially suitable for high-dimensional problems and can also be applied to generalized linear models.

A Sparse-Group Lasso

A regularized model for linear regression with combined ℓ1 and ℓ2 penalties is introduced, and it is shown to have the desired effect of group-wise and within-group sparsity.
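This combined penalty has a closed-form proximal operator: elementwise soft-thresholding (within-group sparsity) followed by group-wise shrinkage (group sparsity). A minimal numpy sketch with hypothetical data, illustrating the operator rather than reproducing the paper's code:

```python
import numpy as np

def soft(z, t):
    """Elementwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_group_prox(beta, groups, lam1, lam2):
    """Prox of lam1*||b||_1 + lam2*sum_g ||b_g||_2: first shrink elementwise,
    then shrink each group's l2 norm, zeroing whole groups that fall below lam2."""
    out = soft(beta, lam1)
    for g in groups:
        norm = np.linalg.norm(out[g])
        scale = max(1.0 - lam2 / norm, 0.0) if norm > 0 else 0.0
        out[g] = scale * out[g]
    return out

beta = np.array([3.0, 0.1, -0.2, 2.0, -2.5, 0.05])
groups = [np.arange(0, 3), np.arange(3, 6)]
b = sparse_group_prox(beta, groups, lam1=0.3, lam2=0.5)
# Entries 1, 2, and 5 are zeroed within their groups; the survivors are shrunk.
```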

New Insights and Faster Computations for the Graphical Lasso

A very simple necessary and sufficient condition can be employed to determine whether the estimated inverse covariance matrix will be block diagonal and, if so, to identify the blocks in the graphical lasso solution.
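The screening rule amounts to thresholding the absolute empirical covariance at the regularization level λ and taking connected components of the resulting graph; those components are the blocks. A self-contained sketch with a hypothetical covariance matrix (my illustration of the rule, not the paper's implementation):

```python
import numpy as np

def glasso_blocks(S, lam):
    """Connected components of the graph with an edge (i, j) iff |S_ij| > lam.
    By the screening condition, the graphical lasso estimate at level lam is
    block diagonal with exactly these blocks."""
    p = S.shape[0]
    adj = np.abs(S) > lam
    np.fill_diagonal(adj, False)
    labels = -np.ones(p, dtype=int)
    n_blocks = 0
    for i in range(p):                 # depth-first search over components
        if labels[i] >= 0:
            continue
        stack = [i]
        labels[i] = n_blocks
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(adj[u]):
                if labels[v] < 0:
                    labels[v] = n_blocks
                    stack.append(v)
        n_blocks += 1
    return n_blocks, labels

# Two uncoupled pairs of variables: strong coupling only within each pair.
S = np.array([[1.0, 0.8, 0.0, 0.1],
              [0.8, 1.0, 0.1, 0.0],
              [0.0, 0.1, 1.0, 0.7],
              [0.1, 0.0, 0.7, 1.0]])
n_blocks, labels = glasso_blocks(S, lam=0.2)
```

Because the check runs before any optimization, it lets each block be solved as a smaller independent graphical lasso problem.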

Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions

A general theorem is derived that bounds the Frobenius-norm error of an estimate for a pair of high-dimensional matrix decomposition problems, obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer.