# Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations

@article{Yan2017HierarchicalSM, title={Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations}, author={Xiaohan Yan and Jacob Bien}, journal={Statistical Science}, year={2017}, volume={32}, pages={531-560} }

Demanding sparsity in estimated models has become a routine
practice in statistics. In many situations, we wish to require that the sparsity
patterns attained honor certain problem-specific constraints. Hierarchical
sparse modeling (HSM) refers to situations in which these constraints
specify that one set of parameters be set to zero whenever another is set to
zero. In recent years, numerous papers have developed convex regularizers
for this form of sparsity structure, which arises in…
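The constraint described above (one set of parameters must be zero whenever another set is zero) can be made concrete with a small check on a fitted coefficient vector. This is a minimal sketch and not the authors' method: the function name `hierarchy_violations` and the encoding of each constraint as a `(parent, child)` pair of index tuples are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# A constraint is a pair of index tuples (parent, child); this encoding is
# hypothetical and chosen only to illustrate the idea.
Constraint = Tuple[Tuple[int, ...], Tuple[int, ...]]


def hierarchy_violations(
    beta: Sequence[float],
    constraints: Sequence[Constraint],
    tol: float = 0.0,
) -> List[Constraint]:
    """Return the constraints whose parent set is all zero while the child set is not."""
    violations = []
    for parent, child in constraints:
        parent_zero = all(abs(beta[i]) <= tol for i in parent)
        child_zero = all(abs(beta[j]) <= tol for j in child)
        if parent_zero and not child_zero:
            violations.append((parent, child))
    return violations


# Index 2 plays the role of an interaction that must vanish whenever
# either main effect (index 0 or index 1) is zero.
constraints = [((0,), (2,)), ((1,), (2,))]
ok = hierarchy_violations([1.5, -0.3, 0.7], constraints)
bad = hierarchy_violations([1.5, 0.0, 0.7], constraints)
```

In this sketch the first coefficient vector honors both constraints, while the second has a zero main effect with a nonzero interaction and is flagged. A convex regularizer for HSM aims to produce estimates for which such a check comes back empty.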

## 37 Citations

### Learning Local Dependence In Ordered Data

- Computer Science
- J. Mach. Learn. Res.
- 2017

This work proposes a framework for learning local dependence based on estimating the inverse of the Cholesky factor of the covariance matrix, which yields a simple regression interpretation for local dependence in which variables are predicted by their neighbors.

### A First-Order Optimization Algorithm for Statistical Learning with Hierarchical Sparsity Structure

- Computer Science
- INFORMS J. Comput.
- 2022

This work proposes a computationally efficient optimization algorithm to evaluate the proximal operator of a nonsmooth hierarchical sparsity-inducing regularizer, establishes its convergence properties, and demonstrates its performance on two statistical learning applications related to topic modeling and breast cancer classification.

### Graph-Guided Banding of the Covariance Matrix

- Mathematics
- Journal of the American Statistical Association
- 2018

This work proposes a generalization of the notion of bandedness that greatly expands the range of problems in which banded estimators apply, and develops convex regularizers occupying the broad middle ground between the former approach of “patternless sparsity” and the latter reliance on having a known ordering.

### A Pliable Lasso

- Computer Science
- Journal of Computational and Graphical Statistics
- 2019

This work proposes a generalization of the lasso that allows the model coefficients to vary as a function of a general set of prespecified modifying variables, such as gender, age, or time.

### LassoNet: Neural Networks with Feature Sparsity

- Computer Science
- AISTATS
- 2021

This work introduces LassoNet, a neural network framework with global feature selection that uses a modified objective function with constraints, and so integrates feature selection with the parameter learning directly, and delivers an entire regularization path of solutions with a range of feature sparsity.

### Learning With Subquadratic Regularization: A Primal-Dual Approach

- Computer Science, Mathematics
- IJCAI
- 2020

This paper describes the efficiency of the algorithms developed in the context of tree-structured sparsity, where they comprehensively outperform relevant baselines and achieve a proven rate of convergence of $O(1/T)$ after $T$ iterations.

### Learning Hierarchical Interactions at Scale: A Convex Optimization Approach

- Computer Science
- AISTATS
- 2020

This work studies a convex relaxation that enforces strong hierarchy, develops a highly scalable algorithm based on proximal gradient descent, and introduces a specialized active-set strategy with gradient screening to avoid costly gradient computations.

### Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages

- Computer Science, Economics
- Journal of the American Statistical Association
- 2021

Novel theoretical machinery includes non-asymptotic analysis of infinite-order VAR, elastic net estimation under a singular covariance structure of regressors, and new concentration inequalities for quadratic forms of random variables from Gaussian time series.

### Learning with Structured Sparsity: From Discrete to Convex and Back

- Computer Science
- 2018

This work introduces a new class of structured sparsity models that captures a large range of structures and admits tight convex relaxations amenable to efficient optimization, and presents an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures.

### Structural regularization in quadratic logistic regression model

- Computer Science
- Knowl. Based Syst.
- 2019

## References

### A Lasso for Hierarchical Interactions

- Computer Science
- Annals of Statistics
- 2013

A precise characterization of the effect of this hierarchy constraint is given, a bound on this estimate reveals the amount of fitting "saved" by the hierarchy constraint, and it is proved that hierarchy holds with probability one.

### A Sparse-Group Lasso

- Computer Science
- 2013

A regularized model for linear regression with $\ell_1$ and $\ell_2$ penalties is introduced, and it is shown to have the desired effect of group-wise and within-group sparsity.

### Proximal methods for the latent group lasso penalty

- Computer Science
- Comput. Optim. Appl.
- 2014

This paper considers a regularized least squares problem with structured sparsity-inducing norms that extend the usual $\ell_1$ and group lasso penalties by allowing the subsets to overlap, and devises a new active set strategy that handles high-dimensional problems without pre-processing for dimensionality reduction.

### Group Lasso with Overlaps: the Latent Group Lasso approach

- Computer Science
- ArXiv
- 2011

We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of predefined overlapping groups of variables. We call the obtained formulation latent group…

### Structured sparsity through convex optimization

- Computer Science
- ArXiv
- 2011

It is shown that the $\ell_1$-norm can be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures.

### Group Regularized Estimation Under Structural Hierarchy

- Computer Science
- 2014

This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony and shows that the proposed estimators enjoy sharp rate oracle inequalities, and give the minimax lower bounds in strong and weak hierarchical variable selection.

### Sparse estimation of large covariance matrices via a nested Lasso penalty

- Computer Science
- 2008

The paper proposes a new covariance estimator for large covariance matrices when the variables have a natural ordering. It imposes a banded structure on the Cholesky factor using a novel penalty called the nested Lasso, which results in a sparse estimator for the inverse of the covariance matrix.

### The composite absolute penalties family for grouped and hierarchical variable selection

- Computer Science
- 2009

CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.

### Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions

- Computer Science
- 2010

This work introduces a new approach, “Variable selection using Adaptive Nonlinear Interaction Structures in High dimensions” (VANISH), that is based on a penalized least squares criterion and is designed for high dimensional nonlinear problems and suggests that VANISH should outperform certain natural competitors when the true interaction structure is sufficiently sparse.

### Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials

- Computer Science
- AISTATS
- 2010

This work uses a spectral projected gradient method as a subroutine for solving the overlapping group $\ell_1$-regularization problem, and makes use of a sparse version of Dykstra’s algorithm to compute the projection.