A Lasso for Hierarchical Interactions

@article{Bien2013ALF,
  title={A Lasso for Hierarchical Interactions},
  author={Jacob Bien and Jonathan E. Taylor and Robert Tibshirani},
  journal={Annals of Statistics},
  year={2013},
  volume={41},
  number={3},
  pages={1111--1141}
}
We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise characterization of the effect of this hierarchy constraint, prove that hierarchy holds with probability one and derive an unbiased estimate for the degrees of freedom of our estimator. A bound on this estimate reveals the amount of fitting "saved" by the… 
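For reference, the paper's strong-hierarchy formulation can be sketched as follows (a reconstruction from memory, so constants and notation may differ from the published version; here $Z$ collects the pairwise products $x_j \ast x_k$, $\beta^+ - \beta^-$ is the main-effect vector split into nonnegative parts, and $\Theta_j$ is the $j$th row of the symmetric interaction matrix $\Theta$):

\[
\begin{aligned}
\min_{\beta^\pm \ge 0,\, \Theta} \quad & \tfrac{1}{2}\big\| y - X(\beta^+ - \beta^-) - \tfrac{1}{2} Z\,\mathrm{vec}(\Theta) \big\|_2^2 + \lambda\, \mathbf{1}^\top (\beta^+ + \beta^-) + \tfrac{\lambda}{2} \|\Theta\|_1 \\
\text{subject to} \quad & \Theta = \Theta^\top, \qquad \|\Theta_j\|_1 \le \beta_j^+ + \beta_j^-, \quad j = 1, \dots, p.
\end{aligned}
\]

The $\ell_1$ budget on each row $\Theta_j$ is capped by the amplitude of the corresponding main effect, so an interaction can be nonzero only if a parent main effect is; dropping the symmetry constraint relaxes strong hierarchy to weak hierarchy.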

Citations

A Pliable Lasso
TLDR
A generalization of the lasso is proposed that allows the model coefficients to vary as a function of a set of prespecified modifying variables, such as gender, age, or time.
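As a sketch of the model (reconstructed from memory of the pliable lasso paper, so the exact form should be checked against the original), each coefficient is allowed to be modified linearly by the variables in $Z$:

\[
\hat{y} = \beta_0 \mathbf{1} + Z\theta_0 + \sum_{j=1}^{p} X_j \,(\beta_j \mathbf{1} + Z\theta_j),
\]

with a lasso-type penalty constructed so that a modifier vector $\theta_j$ can be nonzero only when the corresponding main effect $\beta_j$ is, an asymmetric form of hierarchy.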
Learning Interactions via Hierarchical Group-Lasso Regularization
  • Michael Lim, T. Hastie
  • Computer Science
    Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
  • 2015
We introduce a method for learning pairwise interactions in a linear regression or logistic regression model in a manner that satisfies strong hierarchy: whenever an interaction is estimated to be nonzero, both its associated main effects are also included in the model.
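A rough sketch of the overlapped group-lasso device used for this (scaling constants and the categorical-variable details are omitted here): each interaction group stacks the two main-effect blocks together with the interaction columns, so selecting an interaction group automatically brings its main effects into the model:

\[
\min_{\beta,\, \tilde{\beta}} \ \tfrac{1}{2} \Big\| y - \sum_j X_j \beta_j - \sum_{j<k} [\, X_j \ X_k \ X_{j:k} \,] \, \tilde{\beta}_{jk} \Big\|_2^2 + \lambda \Big( \sum_j \|\beta_j\|_2 + \sum_{j<k} \|\tilde{\beta}_{jk}\|_2 \Big).
\]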
Convex Modeling of Interactions With Strong Heredity
  • Asad Haris, D. Witten, N. Simon
  • Computer Science, Mathematics
    Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
  • 2016
TLDR
FAMILY is a generalization of several existing methods, such as VANISH, hierNet, the all-pairs lasso, and the main-effects-only lasso. It is formulated as the solution to a convex optimization problem, which is solved using an efficient alternating direction method of multipliers (ADMM) algorithm.
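For reference, these are the generic scaled-form ADMM iterations for $\min_{x,z} f(x) + g(z)$ subject to $Ax + Bz = c$; the FAMILY algorithm specializes $f$ and $g$ to its squared-error loss and hierarchy-inducing penalties:

\[
\begin{aligned}
x^{k+1} &= \arg\min_x \ f(x) + \tfrac{\rho}{2} \| Ax + Bz^k - c + u^k \|_2^2, \\
z^{k+1} &= \arg\min_z \ g(z) + \tfrac{\rho}{2} \| Ax^{k+1} + Bz - c + u^k \|_2^2, \\
u^{k+1} &= u^k + Ax^{k+1} + Bz^{k+1} - c.
\end{aligned}
\]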
Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations
TLDR
A side-by-side comparison of the two frameworks for hierarchical sparse modeling, the group lasso (GL) and the latent overlapping group lasso (LOG), is provided in terms of their statistical properties and computational efficiency, with special attention to GL's more aggressive shrinkage of parameters deep in the hierarchy, a property not shared by LOG.
An analysis of penalized interaction models
TLDR
A unified analysis of the convergence rate for a class of penalized interaction estimators, under the assumption that the design satisfies the restricted eigenvalue condition, is presented, and it is shown that, with probability tending to one, the resulting estimates attain the derived rate of convergence in estimation error.
Group Regularized Estimation Under Structural Hierarchy
TLDR
This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony, shows that the proposed estimators enjoy sharp oracle rate inequalities, and gives minimax lower bounds for strong and weak hierarchical variable selection.
Lasso estimation of hierarchical interactions for analyzing heterogeneity of treatment effect
TLDR
A constrained lasso approach is proposed with two parameterization schemes that enforce the hierarchical interaction restriction differently; it provides a well-suited approach for secondary analysis in clinical trials to analyze heterogeneous treatment effects and identify predictive biomarkers.
Ranked sparsity: a cogent regularization framework for selecting and estimating feature interactions and polynomials
TLDR
A modeling strategy is suggested that requires a stronger level of evidence before certain variables are allowed into the final model; it can be implemented with the sparsity-ranked lasso (SRL).
Model selection of hierarchically structured covariates using elastic net
TLDR
It is shown that the proposed estimation approach has a hierarchical grouping property: two highly correlated covariates that are close to each other in the hierarchical tree are more likely to be included in or excluded from the model together than two that are far apart.
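For context, the elastic net referred to here solves (up to scaling conventions)

\[
\hat{\beta} = \arg\min_{\beta} \ \tfrac{1}{2} \| y - X\beta \|_2^2 + \lambda_1 \|\beta\|_1 + \tfrac{\lambda_2}{2} \|\beta\|_2^2,
\]

and its grouping effect, by which highly correlated predictors receive similar coefficients, is what the hierarchical grouping property above builds on.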
...

References

SHOWING 1-10 OF 60 REFERENCES
Degrees of freedom in lasso problems
TLDR
The degrees of freedom of the lasso fit are derived, placing no assumptions on the predictor matrix $X$ (and allowing an arbitrary penalty matrix $D$), and some intermediate results on the lasso and generalized lasso are established that may be of independent interest.
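The headline result, as best I can reconstruct it (treat the exact statement as a sketch): for the lasso fit with an arbitrary predictor matrix,

\[
\mathrm{df}(X\hat{\beta}) = \mathbb{E}\big[ \mathrm{rank}(X_{\mathcal{E}}) \big],
\]

where $\mathcal{E}$ is the equicorrelation set of the solution, so that $\mathrm{rank}(X_{\mathcal{E}})$ is itself an unbiased estimate of the degrees of freedom.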
Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions
TLDR
This work introduces a new approach, “Variable selection using Adaptive Nonlinear Interaction Structures in High dimensions” (VANISH), which is based on a penalized least squares criterion and designed for high-dimensional nonlinear problems; the theory suggests that VANISH should outperform certain natural competitors when the true interaction structure is sufficiently sparse.
Grouped and Hierarchical Model Selection through Composite Absolute Penalties
TLDR
This paper introduces the Composite Absolute Penalties (CAP) family, which expresses grouping and hierarchical relationships between the predictors by combining norms, including the $L_1$ norm, into a composite penalty, adding side information to the fitting of a regression or classification model in order to obtain reasonable estimates.
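Schematically (the group definitions below are illustrative, not the paper's exact construction), a CAP penalty combines group norms

\[
T(\beta) = \sum_{k} \| \beta_{G_k} \|_{\gamma_k},
\]

where the groups $G_k$ may overlap or nest; hierarchy is encoded by nesting, e.g. penalizing $\|(\beta_j, \beta_{jk})\|_{\gamma}$ together with $|\beta_{jk}|$ so that the interaction $\beta_{jk}$ tends to enter only after its parent $\beta_j$.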
Regression Shrinkage and Selection via the Lasso
TLDR
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
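Concretely, the lasso solves

\[
\hat{\beta} = \arg\min_{\beta} \ \| y - X\beta \|_2^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t,
\]

or, equivalently, the Lagrangian form $\min_{\beta} \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1$.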
On the “degrees of freedom” of the lasso
TLDR
The number of nonzero coefficients is shown to be an unbiased estimate of the degrees of freedom of the lasso, a conclusion that requires no special assumption on the predictors; the unbiased estimator is also shown to be asymptotically consistent.
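That is, for each fixed $\lambda$,

\[
\widehat{\mathrm{df}}(\lambda) = \#\{\, j : \hat{\beta}_j(\lambda) \neq 0 \,\},
\]

an unbiased estimate of the degrees of freedom $\mathrm{df} = \sum_i \mathrm{cov}(\hat{y}_i, y_i)/\sigma^2$ in the Stein sense.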
The composite absolute penalties family for grouped and hierarchical variable selection
TLDR
CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.
Group Lasso with Overlaps: the Latent Group Lasso approach
We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of predefined overlapping groups of variables. We call the obtained formulation the latent group lasso.
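The latent group lasso norm can be written as (modulo optional group weights)

\[
\Omega(\beta) = \min_{\{v_g\}} \Big\{ \sum_{g \in \mathcal{G}} \| v_g \|_2 \ : \ \beta = \sum_{g \in \mathcal{G}} v_g, \ \ \mathrm{supp}(v_g) \subseteq g \Big\},
\]

so that the support of the estimated $\beta$ is a union of the active latent groups.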
The Log-Linear Group Lasso Estimator and Its Asymptotic Properties
TLDR
The conditions guaranteeing that the group-lasso estimator is model selection consistent, in the sense that, with overwhelming probability as the sample size increases, it correctly identifies all the sets of nonzero interactions among the variables, are provided.
The solution path of the generalized lasso
TLDR
This work derives an unbiased estimate of the degrees of freedom of the generalized lasso fit for an arbitrary D, which turns out to be quite intuitive in many applications.
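The generalized lasso referred to here is

\[
\hat{\beta} = \arg\min_{\beta} \ \tfrac{1}{2} \| y - X\beta \|_2^2 + \lambda \| D\beta \|_1,
\]

where particular choices of the penalty matrix $D$ recover the fused lasso, trend filtering, and related problems.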
Hierarchical selection of variables in sparse high-dimensional regression
TLDR
It is shown that given an initial estimator, an estimator with a similar prediction loss but with a smaller number of non-zero coordinates can be found.
...