• Corpus ID: 4857366

Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables

Masaaki Takada, Taiji Suzuki, Hironori Fujisawa
Sparse regularization such as $\ell_1$ regularization is a powerful and widely used strategy for high-dimensional learning problems. Its effectiveness has been supported both practically and theoretically by several studies. However, one of its biggest issues is that performance is quite sensitive to correlations between features. Ordinary $\ell_1$ regularization can select variables that are correlated with each other, which results in… 
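
For reference, the ordinary $\ell_1$-regularized least-squares problem that the abstract refers to is commonly written as follows. The notation is an assumption made here and does not come from this page: $y \in \mathbb{R}^n$ is the response, $X \in \mathbb{R}^{n \times p}$ the design matrix, $\beta$ the coefficient vector, and $\lambda > 0$ the regularization strength. The $1/(2n)$ scaling is one common convention among several.

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1, \qquad \lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert$$

The penalty term treats every coefficient separately and contains no information about how the columns of $X$ relate to one another, which is consistent with the abstract's remark that correlated variables can be selected together.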

Independently Interpretable Lasso for Generalized Linear Models
This letter proposes a new regularization method, independently interpretable lasso (IILasso), for generalized linear models; it allows regression coefficients to be interpreted intuitively and also improves performance by avoiding overfitting.
$\ell_P$ Norm Independently Interpretable Regularization Based Sparse Coding for Highly Correlated Data
The experimental results show that all independently interpretable algorithms outperform their original counterparts under different coherence conditions, and that IIWLasso obtains comparatively the best performance in relative norm error and support error on synthetic data and in misclassification error under tenfold cross-validation on gene expression data.
Stable Learning via Differentiated Variable Decorrelation
This paper incorporates unlabeled data from multiple environments into the variable decorrelation framework and proposes a Differentiated Variable Decorrelation (DVD) algorithm, based on the clustering of variables, that improves model parameter estimation and prediction stability over changing distributions.
Stable Prediction with Model Misspecification and Agnostic Distribution Shift
A novel Decorrelated Weighting Regression algorithm which jointly optimizes a variable decorrelation regularizer and a weighted regression model to improve the accuracy of parameter estimation and stability of prediction with model misspecification and agnostic distribution shift.
Deep Stable Learning for Out-Of-Distribution Generalization
This paper proposes to address the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels.
Stable Learning via Sample Reweighting
A sample reweighting method that reduces collinearity among input variables and can be seen as a pretreatment of the data to improve the conditioning of the design matrix; it can be combined with any standard learning method for parameter estimation and variable selection.
Out-of-distribution Generalization via Partial Feature Decorrelation
A novel Partial Feature Decorrelation Learning (PFDL) algorithm that jointly optimizes a feature decomposition network and the target image classification model, and that can improve the backbone model’s accuracy on OOD image classification datasets.
Optimization Induced Equilibrium Networks: An Explicit Optimization Perspective for Understanding Equilibrium Models.
This paper decomposes DNNs into a new class of unit layer that is the proximal operator of an implicit convex function while keeping its output unchanged, and derives the equilibrium model of the unit layer, named OptEq, which outperforms previous implicit models even with fewer parameters.
CompNet: Neural networks growing via the compact network morphism
This work morphs a well-trained neural network into a deeper one in which the network function is preserved and the added layer is compact, unlike previous network morphism approaches, which tend to add more neurons or channels than are actually required and result in model redundancy.


Restricted Eigenvalue Properties for Correlated Gaussian Designs
This paper proves directly that the restricted nullspace and eigenvalue conditions hold with high probability for quite general classes of Gaussian matrices for which the predictors may be highly dependent, and hence for which restricted isometry conditions can be violated with high probability.
Trace Lasso: a trace norm regularization for correlated designs
This paper introduces a new penalty function which takes into account the correlation of the design matrix to stabilize the estimation of the parameter vector of a linear model, and uses the trace norm of the selected covariates as the criterion of model complexity.
Regression Shrinkage and Selection via the Lasso
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell_1$-Constrained Quadratic Programming (Lasso)
  • M. Wainwright • Computer Science • IEEE Transactions on Information Theory • 2009
This work analyzes the behavior of $\ell_1$-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern of a vector $\beta^*$ based on observations contaminated by noise, and establishes precise conditions on the problem dimension $p$, the number $k$ of nonzero elements in $\beta^*$, and the number of observations $n$.
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Regularization and variable selection via the elastic net
It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Nearly unbiased variable selection under minimax concave penalty
It is proved that at a universal penalty level, the MC+ has high probability of matching the signs of the unknowns, and thus correct selection, without assuming the strong irrepresentable condition required by the LASSO.
Uncorrelated Lasso
An effective iterative algorithm, with the proof of convergence, is presented to solve the sparse optimization problem and shows that the proposed method has better classification performance than many state-of-the-art variable selection methods.
This work demonstrates the potential of coordinate descent algorithms for fitting models, establishes their theoretical convergence properties, and shows that they are significantly faster than competing approaches; the numerical results suggest that MCP is the preferred approach among the three methods.
Exclusive Feature Learning on Arbitrary Structures via $\ell_{1,2}$-norm
This paper proposes a new formulation of exclusive group LASSO, which brings out sparsity at intra-group level in the context of feature selection, and proposes an effective iteratively re-weighted algorithm to solve the corresponding optimization problem with rigorous convergence analysis.