# Sparsity and smoothness via the fused lasso

@article{Tibshirani2005SparsityAS,
  title   = {Sparsity and smoothness via the fused lasso},
  author  = {Robert Tibshirani and Michael A. Saunders and Saharon Rosset and Ji Zhu and Keith Knight},
  journal = {Journal of the Royal Statistical Society: Series B (Statistical Methodology)},
  year    = {2005},
  volume  = {67},
  pages   = {91--108}
}

Summary. The lasso penalizes a least squares regression by the sum of the absolute values (L1-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients equal to 0). We propose the ‘fused lasso’, a generalization that is designed for problems with features that can be ordered in some meaningful way. The fused lasso penalizes the L1-norm of both the coefficients and their successive differences. Thus it encourages sparsity of the coefficients and also…
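In Lagrangian form the criterion described above can be written as ½‖y − Xβ‖² + λ₁ Σ|βⱼ| + λ₂ Σ|βⱼ − βⱼ₋₁|. A minimal numpy sketch of this objective (the paper itself states the problem with bound constraints s₁, s₂ rather than the Lagrange multipliers λ₁, λ₂ assumed here):

```python
import numpy as np

def fused_lasso_objective(beta, X, y, lam1, lam2):
    """Lagrangian form of the fused lasso criterion:
    0.5*||y - X beta||^2 + lam1*sum_j |beta_j| + lam2*sum_j |beta_j - beta_{j-1}|.
    lam1 controls sparsity of the coefficients, lam2 sparsity of their differences."""
    resid = y - X @ beta
    return (0.5 * resid @ resid
            + lam1 * np.sum(np.abs(beta))
            + lam2 * np.sum(np.abs(np.diff(beta))))
```

Evaluating this objective on candidate coefficient vectors makes the two roles of the penalties concrete: λ₁ shrinks individual coefficients to zero, while λ₂ fuses neighbouring coefficients to a common value.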

#### 2,173 Citations

The Smooth-Lasso and other l1 + l2-penalized methods

- 2012

We consider a linear regression problem in a high dimensional setting where the number of covariates p can be much larger than the sample size n. In such a situation, one often assumes sparsity of…

Localized Lasso for High-Dimensional Regression

- Mathematics, Computer Science
- AISTATS
- 2017

The localized Lasso is introduced: a method suited for learning models that are both interpretable and highly predictive in problems with high dimensionality and small sample size. A simple yet efficient iterative least-squares optimization procedure is also proposed.

Fused Lasso Screening Rules via the Monotonicity of Subdifferentials

- Mathematics, Medicine
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2015

Novel screening rules are proposed that quickly identify adjacent features with identical coefficients in the fused Lasso, so that the problem size can be significantly reduced, leading to substantial savings in computational cost and memory usage.

A framework to efficiently smooth L1 penalties for linear regression

- Biology, Computer Science
- 2020

A unified framework computes closed-form smooth surrogates for a whole class of L1-penalized regression problems using Nesterov smoothing. The estimates obtained can be made arbitrarily close to those of the original (unsmoothed) objective functions, with explicitly computable a priori error bounds on their accuracy.
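Nesterov smoothing of the absolute value has a well-known Huber-like closed form, which illustrates the kind of a priori error bound this summary refers to (the framework in the paper is more general; the smoothing parameter μ below is an assumed free choice):

```python
import numpy as np

def smoothed_abs(x, mu):
    """Nesterov-smoothed absolute value:
    f_mu(x) = max_{|u|<=1} (u*x - mu*u**2/2)
            = x**2/(2*mu)  if |x| <= mu,  |x| - mu/2  otherwise.
    Uniform a priori error bound: 0 <= |x| - f_mu(x) <= mu/2."""
    ax = np.abs(x)
    return np.where(ax <= mu, x**2 / (2 * mu), ax - mu / 2)
```

Because the smoothed function is differentiable with Lipschitz gradient, gradient-based solvers apply directly, and shrinking μ drives the surrogate estimates toward the unsmoothed L1 solution at the cost of a larger Lipschitz constant.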

Regularization with the Smooth-Lasso procedure

- Mathematics
- 2008

We consider the linear regression problem. We propose the S-Lasso procedure to estimate the unknown regression parameters. This estimator enjoys sparsity of the representation while taking into…

A Path Algorithm for the Fused Lasso Signal Approximator

- Mathematics
- 2009

The Lasso is a very well-known penalized regression model, which adds an L1 penalty with parameter λ1 on the coefficients to the squared error loss function. The Fused Lasso extends this model by…

Bayesian generalized fused lasso modeling via NEG distribution

- Mathematics
- 2016

The fused lasso penalizes a loss function by the L1 norm of both the regression coefficients and their successive differences to encourage sparsity of both. In this paper, we propose a…

Split Bregman method for large scale fused Lasso

- Mathematics, Computer Science
- Comput. Stat. Data Anal.
- 2011

This paper proposes an iterative algorithm based on the split Bregman method to solve a class of large-scale fused Lasso problems, including a generalized fused Lasso and a fused Lasso support vector classifier. The algorithm is derived using an augmented Lagrangian method, and its convergence properties are proved.
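A split Bregman iteration for the L1 fusion term is equivalent to ADMM with a first-difference splitting. A minimal sketch for the fused lasso signal approximator (pure fusion penalty, no separate sparsity term; the penalty parameter ρ and iteration count are arbitrary choices here, not settings from the paper):

```python
import numpy as np

def soft(v, t):
    """Elementwise soft-thresholding: sign(v) * max(|v| - t, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def flsa_admm(y, lam, rho=1.0, iters=500):
    """Split-Bregman/ADMM sketch for min_b 0.5*||y - b||^2 + lam * sum_j |b_{j+1} - b_j|,
    using the splitting z = D b with D the first-difference operator."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)      # (n-1) x n first-difference matrix
    A = np.eye(n) + rho * D.T @ D       # fixed linear system for the b-update
    z = np.zeros(n - 1)
    u = np.zeros(n - 1)                 # scaled dual (Bregman) variable
    for _ in range(iters):
        b = np.linalg.solve(A, y + rho * D.T @ (z - u))   # quadratic subproblem
        z = soft(D @ b + u, lam / rho)                    # shrinkage subproblem
        u = u + D @ b - z                                 # dual/Bregman update
    return b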

An efficient algorithm for a class of fused lasso problems

- Mathematics, Computer Science
- KDD
- 2010

This paper proposes an Efficient Fused Lasso Algorithm (EFLA) and designs a restart technique to accelerate the convergence of SFA by exploiting the special "structures" of both the original and the reformulated FLSA problems.

Efficient Sparse Semismooth Newton Methods for the Clustered Lasso Problem

- Mathematics, Computer Science
- SIAM J. Optim.
- 2019

Experiments show that the Ssnal algorithm substantially outperforms the best alternative algorithm for the clustered lasso problem; an efficient procedure for its computation is also derived.

#### References

Showing 1–10 of 32 references

Regression Shrinkage and Selection via the Lasso

- Mathematics
- 1996

Summary. We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a…
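When the design matrix is orthonormal, the lasso has a closed form: soft-thresholding of the least squares coefficients. A minimal sketch of that special case:

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso estimate for an orthonormal design: shrink the OLS coefficient z
    toward 0 by lam, setting it exactly to 0 when |z| <= lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
```

This is the operation that produces exact zeros in lasso solutions, in contrast to ridge regression, which shrinks proportionally but never to zero.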

Boosting as a Regularized Path to a Maximum Margin Classifier

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2004

It is built on recent work by Efron et al. to show that boosting approximately (and in some cases exactly) minimizes its loss criterion with an l1 constraint on the coefficient vector, and shows that as the constraint is relaxed the solution converges (in the separable case) to an "l1-optimal" separating hyperplane.

Asymptotics for lasso-type estimators

- Mathematics
- 2000

We consider the asymptotic behavior of regression estimators that minimize the residual sum of squares plus a penalty proportional to Σ_j |β_j|^γ for some γ > 0. These estimators include the Lasso as a…

Atomic Decomposition by Basis Pursuit

- Mathematics, Computer Science
- SIAM J. Sci. Comput.
- 1998

Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.

Ridge Regression: Biased Estimation for Nonorthogonal Problems

- Computer Science
- Technometrics
- 2000

The ridge trace is introduced: a method for showing in two dimensions the effects of nonorthogonality, together with a way to augment X′X to obtain biased estimates with smaller mean square error.
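The ridge estimator has the closed form β̂ = (X′X + λI)⁻¹X′y; a minimal sketch (λ here plays the role of the ridge parameter, written k in the paper):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y.
    lam > 0 makes the system well-conditioned even for nonorthogonal X."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

Sweeping λ over a grid and plotting the resulting coefficient paths reproduces the ridge trace the summary describes.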

Ideal spatial adaptation by wavelet shrinkage

- Mathematics
- 1994

Summary. With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline,…

A training algorithm for optimal margin classifiers

- Mathematics, Computer Science
- COLT '92
- 1992

A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions,…

Soft Modelling by Latent Variables: The Non-Linear Iterative Partial Least Squares (NIPALS) Approach

- Mathematics
- 1975

The NIPALS approach is applied to the ‘soft’ type of model that has come to the fore in sociology and other social sciences in the last five or ten years, namely path models that involve latent…

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

- Mathematics
- 2004

In the words of the authors, the goal of this book was to “bring together many of the important new ideas in learning, and explain them in a statistical framework.” The authors have been quite…

1-norm Support Vector Machines

- Computer Science
- NIPS
- 2003

The standard 2-norm SVM is known for its good performance in two-class classification. In this paper, we consider the 1-norm SVM. We argue that the 1-norm SVM may have some advantage over the…