Corpus ID: 233241190

Grouped Variable Selection with Discrete Optimization: Computational and Statistical Perspectives

Hussein Hazimeh, R. Mazumder, P. Radchenko
We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization. While there exist several appealing approaches based on convex relaxations and nonconvex heuristics, we focus on optimal solutions for the ℓ0-regularized formulation, a problem that is relatively unexplored due to computational challenges. Our methodology covers both high-dimensional linear regression and nonparametric sparse additive modeling with smooth components. Our…
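To make the ℓ0-regularized formulation concrete: it seeks coefficients minimizing the residual sum of squares subject to a budget on the number of *active groups* of features. The sketch below is a brute-force illustration of that combinatorial problem on a toy dataset, not the paper's algorithm (which scales to high dimensions); the function `best_group_subset` and the toy data are illustrative assumptions.

```python
import itertools
import numpy as np

def best_group_subset(X, y, groups, k):
    """Brute-force solution of min ||y - X beta||^2 subject to at most
    k active feature groups (the group-l0 constraint). Enumerates all
    k-subsets of groups and fits least squares on each -- feasible only
    for tiny problems; shown purely to pin down the formulation."""
    group_ids = sorted(set(groups))
    best_rss, best_subset = np.inf, None
    for subset in itertools.combinations(group_ids, k):
        # Columns whose group is in the candidate subset.
        cols = [j for j, g in enumerate(groups) if g in subset]
        Xs = X[:, cols]
        beta_s, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ beta_s) ** 2))
        if rss < best_rss:
            best_rss, best_subset = rss, subset
    return best_rss, best_subset

# Toy data: three groups of two features each; groups 0 and 2 carry signal.
rng = np.random.default_rng(0)
groups = [0, 0, 1, 1, 2, 2]
X = rng.standard_normal((100, len(groups)))
beta_true = np.array([1.5, -2.0, 0.0, 0.0, 3.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(100)

rss, chosen = best_group_subset(X, y, groups, k=2)
print(chosen)  # recovers the two signal groups: (0, 2)
```

With strong signal and only three groups, exhaustive search recovers the true support; the point of the paper's discrete-optimization machinery is to certify such optimal solutions without enumerating all subsets.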


Group selection and shrinkage with application to sparse semiparametric modeling
Sparse regression and classification estimators capable of group selection have application to an assortment of statistical problems, from multitask learning to sparse additive modeling…
Predicting Census Survey Response Rates via Interpretable Nonparametric Additive Models with Structured Interactions
Accurate and interpretable prediction of survey response rates is important from an operational standpoint. The US Census Bureau's well-known ROAM application uses principled statistical models…
Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives
A discrete optimization-based approach for learning sparse classifiers, where the outcome depends on a linear combination of a small subset of features, leading to models with considerably improved statistical performance compared to competing toolkits.


Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms
This paper empirically demonstrates that a family of ℓ0-based estimators can outperform state-of-the-art sparse learning algorithms in terms of a combination of prediction, estimation, and variable selection metrics under various regimes (e.g., different signal strengths, feature correlations, numbers of samples and features).
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
A novel high-dimensional linear regression estimator, the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals, is proposed.
Regularization vs. Relaxation: A conic optimization perspective of statistical variable selection
This paper shows that a popular sparsity-inducing concave penalty function known as the Minimax Concave Penalty (MCP), and the reverse Huber penalty derived in a recent work by Pilanci, Wainwright and El Ghaoui, can both be derived as special cases of a lifted convex relaxation called the perspective relaxation.
Mixed-integer nonlinear optimization
An emerging area of mixed-integer optimal control that adds systems of ordinary differential equations to MINLP is described, and a range of approaches for tackling this challenging class of problems are discussed, including piecewise linear approximations, generic strategies for obtaining convex relaxations of nonconvex functions, spatial branch-and-bound methods, and a small sample of techniques that exploit particular types of nonconvex structure to obtain improved convex relaxations.
Best Subset Selection via a Modern Optimization Lens
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving mixed integer optimization problems…
Sparse high-dimensional regression: Exact scalable algorithms and phase transitions
We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve…
Perspective reformulations of mixed integer nonlinear programs with indicator variables
Computational results for three applications (quadratic facility location, network design with congestion, and portfolio optimization with buy-in thresholds) that show the power of the reformulation technique are presented.
Sparsity Constrained Nonlinear Optimization: Optimality Conditions and Algorithms
This paper treats the problem of minimizing a general continuously differentiable function subject to sparsity constraints. We present and analyze several different optimality criteria which are…
Statistical Learning with Sparsity: The Lasso and Generalizations
Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data and extract useful and reproducible patterns from big datasets.
Learning Hierarchical Interactions at Scale: A Convex Optimization Approach
A convex relaxation which enforces strong hierarchy is studied, a highly scalable algorithm based on proximal gradient descent is developed, and a specialized active-set strategy with gradient screening for avoiding costly gradient computations is introduced.