# Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations

@article{Yan2017HierarchicalSM,
title={Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations},
author={Xiaohan Yan and Jacob Bien},
journal={Statistical Science},
year={2017},
volume={32},
pages={531-560}
}
• Published 5 December 2015
• Computer Science
• Statistical Science
Demanding sparsity in estimated models has become a routine practice in statistics. In many situations, we wish to require that the sparsity patterns attained honor certain problem-specific constraints. Hierarchical sparse modeling (HSM) refers to situations in which these constraints specify that one set of parameters be set to zero whenever another is set to zero. In recent years, numerous papers have developed convex regularizers for this form of sparsity structure, which arises in…
35 Citations

## Figures and Tables from this paper

Learning Local Dependence In Ordered Data
• Computer Science
J. Mach. Learn. Res.
• 2017
This work proposes a framework for learning local dependence based on estimating the inverse of the Cholesky factor of the covariance matrix, which yields a simple regression interpretation for local dependence in which variables are predicted by their neighbors.
Graph-Guided Banding of the Covariance Matrix
• J. Bien
• Mathematics
Journal of the American Statistical Association
• 2018
This work proposes a generalization of the notion of bandedness that greatly expands the range of problems in which banded estimators apply, and develops convex regularizers occupying the broad middle ground between the former approach of “patternless sparsity” and the latter reliance on having a known ordering.
A Pliable Lasso
• Computer Science
Journal of Computational and Graphical Statistics
• 2019
A generalization of the lasso that allows the model coefficients to vary as a function of a general set of some prespecified modifying variables, which might be variables such as gender, age, or time is proposed.
LassoNet: A Neural Network with Feature Sparsity
• Computer Science
J. Mach. Learn. Res.
• 2021
LassoNet is introduced, a neural network framework with global feature selection that uses a modified objective function with constraints, and so integrates feature selection with the parameter learning directly, and delivers an entire regularization path of solutions with a range of feature sparsity.
LassoNet: Neural Networks with Feature Sparsity
• Computer Science
AISTATS
• 2021
This work introduces LassoNet, a neural network framework with global feature selection that uses a modified objective function with constraints, and so integrates feature selection with the parameter learning directly, and delivers an entire regularization path of solutions with a range of feature sparsity.
Learning With Subquadratic Regularization : A Primal-Dual Approach
• Computer Science, Mathematics
IJCAI
• 2020
This paper describes the efficiency of the algorithms developed in the context of tree-structured sparsity, where they comprehensively outperform relevant baselines and achieve a proven rate of convergence of O(1/T ) after T iterations.
Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages
• Computer Science, Economics
Journal of the American Statistical Association
• 2021
Novel theoretical machinery includes non-asymptotic analysis of infinite-order VAR, elastic net estimation under a singular covariance structure of regressors, and new concentration inequalities for quadratic forms of random variables from Gaussian time series.
Learning with Structured Sparsity: From Discrete to Convex and Back.
A new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization are introduced, and an in-depth study of the geometric and statistical properties of convex Relaxations of general combinatorial structures is presented.
Graphical-model based high dimensional generalized linear models
• Computer Science
• 2021
A graphic model-based doubly sparse regularized estimator is discussed under the high dimensional generalized linear models, that utilizes the graph structure among the predictors, that allows the dimensionality grows exponentially with the sample size.

## References

SHOWING 1-10 OF 36 REFERENCES
A Sparse-Group Lasso
• Computer Science
• 2013
A regularized model for linear regression with ℓ1 andℓ2 penalties is introduced and it is shown that it has the desired effect of group-wise and within group sparsity.
Proximal methods for the latent group lasso penalty
• Computer Science
Comput. Optim. Appl.
• 2014
This paper considers a regularized least squares problem, with regularization by structured sparsity-inducing norms, which extend the usual ℓ1 and the group lasso penalty, by allowing the subsets to overlap, and devise a new active set strategy that allows to deal with high dimensional problems without pre-processing for dimensionality reduction.
Group Lasso with Overlaps: the Latent Group Lasso approach
• Computer Science
ArXiv
• 2011
We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of prede ned overlapping groups of variables. We call the obtained formulation latent group
Structured sparsity through convex optimization
• Computer Science
ArXiv
• 2011
It is shown that the $\ell_1$-norm can be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures.
Group Regularized Estimation Under Structural Hierarchy
• Computer Science
• 2014
This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony and shows that the proposed estimators enjoy sharp rate oracle inequalities, and give the minimax lower bounds in strong and weak hierarchical variable selection.
Sparse estimation of large covariance matrices via a nested Lasso penalty
• Computer Science
• 2008
The paper proposes a new covariance estimator for large covariance matrices when the variables have a natural ordering, and imposes a banded structure on the Cholesky factor, using a novel penalty called nested Lasso, which results in a sparse estimators for the inverse of the covariance matrix.
The composite absolute penalties family for grouped and hierarchical variable selection
• Computer Science
• 2009
CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.
Convex Modeling of Interactions With Strong Heredity
• Computer Science, Mathematics
Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
• 2016
FAMILY is a generalization of several existing methods, such as VANISH, hierNet, the all-pairs lasso, and the lasso using only main effects, formulated as the solution to a convex optimization problem, which is solved using an efficient alternating directions method of multipliers (ADMM) algorithm.
Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials
• Computer Science
AISTATS
• 2010
This work uses a spectral projected gradient method as a subroutine for solving the overlapping group ‘1regularization problem, and makes use of a sparse version of Dykstra’s algorithm to compute the projection.
Proximal Methods for Hierarchical Sparse Coding
• Computer Science
J. Mach. Learn. Res.
• 2011
The procedure has a complexity linear, or close to linear, in the number of atoms, and allows the use of accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional ones using the l1-norm.