# Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations

```bibtex
@article{Yan2017HierarchicalSM,
  title   = {Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations},
  author  = {Xiaohan Yan and Jacob Bien},
  journal = {Statistical Science},
  year    = {2017},
  volume  = {32},
  pages   = {531-560}
}
```

Demanding sparsity in estimated models has become a routine practice in statistics. In many situations, we wish to require that the attained sparsity patterns honor certain problem-specific constraints. Hierarchical sparse modeling (HSM) refers to situations in which these constraints specify that one set of parameters be set to zero whenever another is set to zero. In recent years, numerous papers have developed convex regularizers for this form of sparsity structure, which arises in…
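
One of the two formulations the title refers to penalizes nested groups: each parameter's group contains the parameter together with all of its descendants in the hierarchy, so a child can be nonzero only when its ancestors are. As a hedged sketch (not code from the paper), the proximal operator of such a nested-group penalty can be computed by composing blockwise soft-thresholding steps from the leaves to the root, as in the tree-structured sparse coding literature; the toy one-parent, one-child hierarchy below is hypothetical.

```python
import numpy as np

def block_soft_threshold(v, t):
    """Shrink the block v toward the origin; zero it out entirely if ||v|| <= t."""
    n = np.linalg.norm(v)
    if n <= t:
        return np.zeros_like(v)
    return (1.0 - t / n) * v

def prox_nested_group_lasso(beta, groups, lam):
    """Proximal operator of lam * sum_g ||beta_g||_2 for tree-nested groups.

    `groups` must be ordered children-first (deepest nodes before the root),
    each group holding a node's index together with all of its descendants.
    """
    beta = np.asarray(beta, dtype=float).copy()
    for g in groups:  # leaves-to-root composition
        beta[g] = block_soft_threshold(beta[g], lam)
    return beta

# Toy two-parameter hierarchy: index 0 is the parent, index 1 its child.
groups = [[1], [0, 1]]  # child's group first, then the root group
print(prox_nested_group_lasso([1.0, 2.0], groups, 0.5))  # both survive, shrunk
print(prox_nested_group_lasso([2.0, 0.3], groups, 0.5))  # child zeroed, parent kept
print(prox_nested_group_lasso([0.1, 0.2], groups, 0.5))  # both zeroed
```

Note that the attainable zero patterns are hierarchical: the root-group step can only zero parent and child together, while the child's own step can zero the child alone.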


## 35 Citations

Learning Local Dependence In Ordered Data

- Computer Science · J. Mach. Learn. Res.
- 2017

This work proposes a framework for learning local dependence based on estimating the inverse of the Cholesky factor of the covariance matrix, which yields a simple regression interpretation for local dependence in which variables are predicted by their neighbors.

Graph-Guided Banding of the Covariance Matrix

- Mathematics · Journal of the American Statistical Association
- 2018

This work proposes a generalization of the notion of bandedness that greatly expands the range of problems in which banded estimators apply, and develops convex regularizers occupying the broad middle ground between patternless sparsity and reliance on a known variable ordering.

A Pliable Lasso

- Computer Science · Journal of Computational and Graphical Statistics
- 2019

A generalization of the lasso is proposed that allows the model coefficients to vary as a function of prespecified modifying variables, such as gender, age, or time.

LassoNet: A Neural Network with Feature Sparsity

- Computer Science · J. Mach. Learn. Res.
- 2021

LassoNet is introduced, a neural network framework with global feature selection that uses a modified objective function with constraints, thereby integrating feature selection directly with parameter learning, and it delivers an entire regularization path of solutions spanning a range of feature sparsity.

LassoNet: Neural Networks with Feature Sparsity

- Computer Science · AISTATS
- 2021

This work introduces LassoNet, a neural network framework with global feature selection that uses a modified objective function with constraints, thereby integrating feature selection directly with parameter learning; it delivers an entire regularization path of solutions spanning a range of feature sparsity.

Learning With Subquadratic Regularization: A Primal-Dual Approach

- Computer Science, Mathematics · IJCAI
- 2020

This paper demonstrates the efficiency of the developed algorithms in the context of tree-structured sparsity, where they comprehensively outperform relevant baselines and achieve a proven convergence rate of O(1/T) after T iterations.

Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages

- Computer Science, Economics · Journal of the American Statistical Association
- 2021

Novel theoretical machinery includes non-asymptotic analysis of infinite-order VAR, elastic net estimation under a singular covariance structure of regressors, and new concentration inequalities for quadratic forms of random variables from Gaussian time series.

Learning with Structured Sparsity: From Discrete to Convex and Back.

- Computer Science
- 2018

A new class of structured sparsity models is introduced that captures a large range of structures and admits tight convex relaxations amenable to efficient optimization; an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures is also presented.

The LASSO on latent indices for regression modeling with ordinal categorical predictors

- Computer Science · Comput. Stat. Data Anal.
- 2020

Graphical-model based high dimensional generalized linear models

- Computer Science
- 2021

A graphical-model-based doubly sparse regularized estimator is discussed for high-dimensional generalized linear models; it utilizes the graph structure among the predictors and allows the dimensionality to grow exponentially with the sample size.

## References

Showing 1-10 of 36 references

A Sparse-Group Lasso

- Computer Science
- 2013

A regularized model for linear regression with ℓ1 and ℓ2 penalties is introduced, and it is shown to have the desired effect of yielding both group-wise and within-group sparsity.
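
As a quick illustration of the penalty this entry describes (a hypothetical sketch, not the authors' code), the sparse-group lasso combines an ℓ1 term, which zeroes individual coefficients within a group, with a sum of group ℓ2 norms, which zeroes whole groups:

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam1, lam2):
    """lam1 * ||beta||_1 (within-group sparsity)
    + lam2 * sum over groups g of ||beta_g||_2 (group-wise sparsity)."""
    beta = np.asarray(beta, dtype=float)
    l1 = np.abs(beta).sum()
    group_l2 = sum(np.linalg.norm(beta[g]) for g in groups)
    return lam1 * l1 + lam2 * group_l2

# One active group and one fully zero group: 7 (l1) + 5 + 0 (group l2) = 12
print(sparse_group_lasso_penalty([3.0, -4.0, 0.0, 0.0], [[0, 1], [2, 3]], 1.0, 1.0))  # 12.0
```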

Proximal methods for the latent group lasso penalty

- Computer Science · Comput. Optim. Appl.
- 2014

This paper considers a regularized least squares problem with regularization by structured sparsity-inducing norms, which extend the usual ℓ1 and group lasso penalties by allowing the subsets to overlap, and devises a new active-set strategy that makes it possible to handle high-dimensional problems without pre-processing for dimensionality reduction.

Group Lasso with Overlaps: the Latent Group Lasso approach

- Computer Science · arXiv
- 2011

We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of predefined overlapping groups of variables. We call the obtained formulation latent group…
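
The latent group lasso norm is defined through an infimum over decompositions of the coefficient vector into latent vectors, each supported on one of the (possibly overlapping) groups. As a hypothetical sketch for intuition (not from this paper): with two groups in R^3 overlapping in a single coordinate, only that shared coordinate can be split between the latent vectors, so the infimum reduces to a one-dimensional minimization that a grid search approximates well.

```python
import numpy as np

def latent_group_norm_two_groups(beta, n_grid=20001):
    """Latent (overlap) group lasso norm of beta in R^3 with groups
    g1 = {0, 1} and g2 = {1, 2}: the infimum of ||v1||_2 + ||v2||_2 over
    decompositions beta = v1 + v2 with v1 supported on g1 and v2 on g2.
    Only beta[1] lies in the overlap, so the split is one-dimensional."""
    b0, b1, b2 = beta
    # v1 = (b0, t, 0) and v2 = (0, b1 - t, b2); search over the split t.
    lo, hi = min(0.0, b1) - abs(b1) - 1.0, max(0.0, b1) + abs(b1) + 1.0
    t = np.linspace(lo, hi, n_grid)
    vals = np.sqrt(b0**2 + t**2) + np.sqrt((b1 - t)**2 + b2**2)
    return vals.min()

# A vector supported on g1 alone costs exactly its own norm (best split t = 0):
print(round(latent_group_norm_two_groups([1.0, 0.0, 0.0]), 6))  # 1.0
```

Unlike the plain overlapping group lasso, which penalizes ||β_g1|| + ||β_g2|| and so counts the overlap twice, the latent formulation selects supports that are unions of whole groups.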

Structured sparsity through convex optimization

- Computer Science · arXiv
- 2011

It is shown that the $\ell_1$-norm can be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures.

Group Regularized Estimation Under Structural Hierarchy

- Computer Science
- 2014

This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony, shows that the proposed estimators enjoy sharp rate oracle inequalities, and gives minimax lower bounds for strong and weak hierarchical variable selection.

Sparse estimation of large covariance matrices via a nested Lasso penalty

- Computer Science
- 2008

The paper proposes a new estimator for large covariance matrices when the variables have a natural ordering; it imposes a banded structure on the Cholesky factor using a novel penalty called the nested Lasso, which results in a sparse estimator of the inverse covariance matrix.

The composite absolute penalties family for grouped and hierarchical variable selection

- Computer Science
- 2009

CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.

Convex Modeling of Interactions With Strong Heredity

- Computer Science, Mathematics · Journal of Computational and Graphical Statistics
- 2016

FAMILY, a generalization of several existing methods such as VANISH, hierNet, the all-pairs lasso, and the main-effects-only lasso, is formulated as the solution to a convex optimization problem, which is solved using an efficient alternating direction method of multipliers (ADMM) algorithm.

Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials

- Computer Science · AISTATS
- 2010

This work uses a spectral projected gradient method as a subroutine for solving the overlapping group ℓ1-regularization problem, and makes use of a sparse version of Dykstra's algorithm to compute the projection.

Proximal Methods for Hierarchical Sparse Coding

- Computer Science · J. Mach. Learn. Res.
- 2011

The procedure has a complexity linear, or close to linear, in the number of atoms, and allows the use of accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional ones using the ℓ1-norm.