Corpus ID: 88519801

When Does More Regularization Imply Fewer Degrees of Freedom? Sufficient Conditions and Counter Examples from Lasso and Ridge Regression

Shachar Kaufman and Saharon Rosset · arXiv: Statistics Theory

Regularization aims to improve the prediction performance of a given statistical modeling approach by moving to a second approach that achieves worse training error but is expected to have fewer degrees of freedom, i.e., better agreement between training and prediction error. We show here, however, that this expected behavior does not hold in general. In fact, counterexamples are given showing that regularization can increase the degrees of freedom in simple situations, including lasso and ridge… 
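For the textbook ridge case with a fixed design and fixed tuning parameter, the expected behavior does hold: the degrees of freedom have the closed form df(lam) = sum_i d_i^2 / (d_i^2 + lam), with d_i the singular values of X, which is monotone decreasing in lam. A minimal numpy sketch of this baseline (the design and lam grid are illustrative choices, not from the paper):

```python
import numpy as np

def ridge_df(X, lam):
    # Degrees of freedom of ridge regression at penalty lam:
    # df(lam) = sum_i d_i^2 / (d_i^2 + lam), d_i the singular values of X.
    d = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(d**2 / (d**2 + lam)))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))  # illustrative 50 x 5 Gaussian design

dfs = [ridge_df(X, lam) for lam in (0.0, 1.0, 10.0, 100.0)]
# At lam = 0 this recovers the rank of X (here 5, the OLS fit),
# and df decreases as lam grows.
```

The paper's counterexamples concern settings beyond this fixed-design, fixed-lam baseline, where such monotonicity can fail.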

On Degrees of Freedom of Projection Estimators With Applications to Multivariate Nonparametric Regression

In this article, we consider the nonparametric regression problem with multivariate predictors. We provide a characterization of the degrees of freedom and divergence for estimators of the…

Degrees of freedom and model search

Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed by a given procedure. But, despite this fundamental…

Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons

An expanded set of simulations is presented to shed more light on empirical comparisons of best subset with other popular variable selection procedures, in particular, the lasso and forward stepwise selection, suggesting that best subset consistently outperformed both methods in terms of prediction accuracy.

Effective degrees of freedom : a flawed metaphor

To most applied statisticians, a fitting procedure’s degrees of freedom is synonymous with its model complexity, or its capacity for overfitting to data. In particular, it is often used to…

Effective degrees of freedom: a flawed metaphor.

This work exhibits and theoretically explores various fitting procedures for which degrees of freedom is not monotonic in the model complexity parameter and can exceed the total dimension of the ambient space, even in very simple settings.

Statistical Learning as a Regression Problem

  • R. Berk
  • Statistical Learning from a Regression Perspective
  • 2020

This chapter sets much of the conceptual stage for later chapters by discussing the ways in which statistical learning can differ from conventional regression analysis.

Are we far from correctly inferring gene interaction networks with Lasso?

This work reviews nine penalised regression methods applied to microarray data to infer the topology of the network of interactions and analyses the limitations of each in order to suggest a number of precautions that should be considered to make their predictions more significant and reliable.



The Degrees of Freedom of Partial Least Squares Regression

This work studies the intrinsic complexity of partial least squares regression and shows that the degrees of freedom depend on the collinearity of the predictor variables: the lower the collinearity, the higher the complexity.

Regression Shrinkage and Selection via the Lasso

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
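The constrained form described above is equivalent to a penalized form, (1/2)||y - Xb||^2 + lam * ||b||_1, which coordinate descent solves by repeated soft-thresholding. A small self-contained numpy sketch (the data, coefficients, and lam are illustrative choices):

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator, the proximal map of the L1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Coordinate descent for (1/2)||y - Xb||^2 + lam * ||b||_1,
    # the penalized form equivalent to the constrained form above.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]  # partial residual excluding j
            b[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return b

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 8))
beta = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0])
y = X @ beta + 0.1 * rng.standard_normal(100)

b_hat = lasso_cd(X, y, lam=20.0)
# A large enough lam zeroes out the noise coefficients, leaving a sparse fit.
```

The sparsity of the solution is what makes the lasso both a shrinkage and a variable selection method.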

On the “degrees of freedom” of the lasso

The number of nonzero coefficients is an unbiased estimate of the degrees of freedom of the lasso, a conclusion that requires no special assumptions on the predictors; the unbiased estimator is also shown to be asymptotically consistent.
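This result can be checked numerically: with an orthonormal design the lasso fit reduces to plain soft-thresholding, and a Monte Carlo estimate of the covariance definition df = (1/sigma^2) * sum_i Cov(yhat_i, y_i) should match the average number of nonzero coefficients. A sketch under these illustrative choices of sizes and tuning:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, sigma, lam, reps = 30, 10, 1.0, 1.0, 4000

# Orthonormal design, so the lasso solution is soft-thresholding of X^T y.
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # X^T X = I_p
beta = np.zeros(p)
beta[:3] = 1.5
mu = X @ beta

eps = sigma * rng.standard_normal((reps, n))
yhat = np.empty((reps, n))
nonzeros = np.empty(reps)
for r in range(reps):
    y = mu + eps[r]
    z = X.T @ y
    b = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)  # lasso, orthonormal case
    yhat[r] = X @ b
    nonzeros[r] = np.count_nonzero(b)

# Monte Carlo estimate of (1/sigma^2) * sum_i Cov(yhat_i, y_i).
df_mc = np.sum((yhat - yhat.mean(0)) * eps) / (reps * sigma**2)
# df_mc should be close to nonzeros.mean(), matching the unbiasedness result.
```

The agreement here is exact in expectation; the Monte Carlo error shrinks at the usual 1/sqrt(reps) rate.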


For the problem of estimating a regression function, μ say, subject to shape constraints, like monotonicity or convexity, it is argued that the divergence of the maximum likelihood estimator provides…

On the degrees of freedom in shrinkage estimation

Least angle regression

A publicly available algorithm is described that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.

Regularization and variable selection via the elastic net

It is shown that the elastic net often outperforms the lasso while enjoying a similar sparsity of representation, and an algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.

The Estimation of Prediction Error

A Rao–Blackwell type of relation is derived in which nonparametric methods such as cross-validation are seen to be randomized versions of their covariance penalty counterparts.

The solution path of the generalized lasso

This work derives an unbiased estimate of the degrees of freedom of the generalized lasso fit for an arbitrary D, which turns out to be quite intuitive in many applications.

On Measuring and Correcting the Effects of Data Mining and Model Selection

The concept of generalized degrees of freedom (GDF) offers a unified framework under which complex and highly irregular modeling procedures can be analyzed in the same way as classical linear models, and many difficult problems can be solved easily.