• Corpus ID: 88520317

Lecture notes on ridge regression

@article{Wieringen2015LectureNO,
  title={Lecture notes on ridge regression},
  author={Wessel N. van Wieringen},
  journal={arXiv: Methodology},
  year={2015}
}
  • W. N. van Wieringen
  • Published 30 September 2015
  • Mathematics, Computer Science
  • arXiv: Methodology
The linear regression model cannot be fitted to high-dimensional data, as the high dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty, i.e. a function of the regression coefficients. The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. moments, mean squared error, its equivalence to…
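As a quick illustration of the penalty just described: for a positive penalty parameter lam, the ridge estimator has the familiar closed form (X'X + lam*I)^{-1} X'y, which exists even when the number of covariates exceeds the sample size. The sketch below (plain NumPy, with simulated data and a penalty value that are purely illustrative) is not taken from the lecture notes themselves.

import numpy as np

def ridge_estimator(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^{-1} X'y.
    # For lam > 0 the matrix X'X + lam*I is invertible even when p > n.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n, p = 20, 50                              # high-dimensional: more covariates than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0                             # only the first five covariates matter
y = X @ beta + 0.5 * rng.standard_normal(n)

beta_hat = ridge_estimator(X, y, lam=1.0)
print(beta_hat[:5])                        # shrunken estimates of the nonzero coefficients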

Citations of this paper

HDBRR: A Statistical Package for High Dimensional Ridge Regression without MCMC
TLDR
A computational algorithm to obtain posterior estimates of regression parameters, variance components and predictions for the conventional ridge regression model is proposed, based on a reparameterization of the model which allows the marginal posterior means and variances to be obtained by integrating out a nuisance parameter whose marginal posterior is defined on the open interval.
Particle swarm optimization based ridge logistic estimator
TLDR
A new alternative approach based on particle swarm optimization is introduced to obtain an optimal shrinkage parameter in ridge logistic regression, and the performance of the new approach is evaluated by simulation studies and a real dataset application.
Inference for the Linear IV Model Ridge Estimator Using Training and Test Samples
The asymptotic distribution is presented for the linear instrumental variables model estimated with a ridge penalty and a prior where the tuning parameter is selected with a holdout sample. …
Cluster Regularization via a Hierarchical Feature Regression
TLDR
A novel cluster-based regularization — the hierarchical feature regression (HFR) — is proposed, which mobilizes insights from the domains of machine learning and graph theory to estimate parameters along a supervised hierarchical representation of the predictor set, shrinking parameters towards group targets.
Optimum shrinkage parameter selection for ridge type estimator of Tobit model
This paper presents different ridge-type estimators based on maximum likelihood (ML) for the parameters of a Tobit model. In this context, an algorithm is introduced to obtain the estimators based on ML. …
Fridge: Focused fine‐tuning of ridge regression for personalized predictions
TLDR
The focused ridge (fridge) procedure is introduced with a two-part contribution: an oracle tuning parameter minimizing the mean squared prediction error of a specific covariate vector is defined, and the procedure is extended to logistic ridge regression by using a parametric bootstrap.
Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives
TLDR
It is shown that Frequent Directions (FD) can be used in the optimization setting through an iterative scheme that yields high-accuracy solutions, improving on randomized approaches that must trade off the need for a new sketch at every iteration against the speed of convergence.
yaglm: a Python package for fitting and tuning generalized linear models that supports structured, adaptive and non-convex penalties
TLDR
The yaglm package aims to make the broader ecosystem of modern generalized linear models accessible to data analysts and researchers, and comes with a variety of tuning parameter selection methods, including cross-validation, information criteria with favorable model selection properties, and degrees of freedom estimators (a generic cross-validation sketch follows this list).
Function Estimation via Reconstruction
TLDR
It is shown that the reconstruction idea not only provides different angles from which to view existing methods, but also produces new, effective experimental design and estimation methods for nonparametric models.
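Several of the citing papers above revolve around choosing the ridge shrinkage (tuning) parameter. The sketch below selects the penalty by plain K-fold cross-validation on simulated data; it is a generic illustration under an assumed penalty grid and assumed data, not the procedure of any particular paper listed here.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, p = 100, 30
X = rng.standard_normal((n, p))
y = X[:, :5].sum(axis=1) + rng.standard_normal(n)

lambdas = np.logspace(-3, 3, 13)                 # candidate penalty values (assumed grid)
cv = KFold(n_splits=5, shuffle=True, random_state=1)

def cv_error(lam):
    # Mean squared prediction error of ridge with penalty lam, averaged over folds.
    errors = []
    for train, test in cv.split(X):
        model = Ridge(alpha=lam).fit(X[train], y[train])
        errors.append(np.mean((y[test] - model.predict(X[test])) ** 2))
    return np.mean(errors)

best_lam = min(lambdas, key=cv_error)
print(f"cross-validated penalty: {best_lam:.3g}")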

References

Showing 1-10 of 60 references
Ridge Estimators in Logistic Regression
TLDR
It is shown how ridge estimators can be used in logistic regression to improve the parameter estimates, diminish the error of further predictions, and predict new observations more accurately (a brief sketch follows these references).
On the Practice of Rescaling Covariates
Whether doing parametric or nonparametric regression with shrinkage, thresholding, penalized likelihood, Bayesian posterior estimators (e.g., ridge regression, lasso, principal component regression, …
Ridge regression: biased estimation for nonorthogonal problems
In multiple regression it is shown that parameter estimates based on minimum residual sum of squares have a high probability of being unsatisfactory, if not incorrect, if the prediction vectors are not orthogonal.
Estimation in high-dimensional linear models with deterministic design matrices
TLDR
This work considers the ridge regression estimator of the projection vector and proposes to threshold the ridge regressor when the projection vector is sparse, in the sense that many of its components are small, which is a reasonable approach.
The Risk of James–Stein and Lasso Shrinkage
This article compares the mean-squared error (or ℓ2 risk) of ordinary least squares (OLS), James–Stein, and least absolute shrinkage and selection operator (Lasso) shrinkage estimators in simple …
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
TLDR
In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
On the LASSO and its dual
TLDR
Consideration of the primal and dual problems together leads to important new insights into the characteristics of the LASSO estimator and to an improved method for estimating its covariance matrix.
Model selection and estimation in regression with grouped variables
We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor …
A Note on the Lasso and Related Procedures in Model Selection
TLDR
It is shown that for any sample size n, when there are superfluous variables in the linear regression model and the design matrix is orthogonal, the probability that these procedures correctly identify the true set of important variables is less than a constant not depending on n.
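For the logistic case referenced above ("Ridge Estimators in Logistic Regression"), the same quadratic penalty is applied to the logistic log-likelihood. A minimal sketch, assuming scikit-learn's parameterization in which C is the inverse of the penalty weight; the simulated data and the chosen penalty value are purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n, p = 200, 40
X = rng.standard_normal((n, p))
logits = X[:, :3].sum(axis=1)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))   # Bernoulli responses

lam = 5.0                                            # assumed ridge penalty weight
clf = LogisticRegression(penalty="l2", C=1.0 / lam, max_iter=1000).fit(X, y)
print("largest |coefficient|:", np.abs(clf.coef_).max())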