Corpus ID: 245650677

Cluster Stability Selection

Gregory Faletto and Jacob Bien
Stability selection [Meinshausen and Bühlmann, 2010] makes any feature selection method more stable by returning only those features that are consistently selected across many subsamples. We prove (in what is, to our knowledge, the first result of its kind) that for data containing highly correlated proxies for an important latent variable, the lasso typically selects one proxy, yet stability selection with the lasso can fail to select any proxy, leading to worse predictive performance than the… 
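
The subsample-and-vote idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`stability_selection`, `top_k_selector`), the half-size subsamples, the 0.6 threshold, and the toy data are all hypothetical choices, and a simple top-k correlation screen stands in for the lasso so that the example needs only the standard library.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute sample correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def top_k_selector(X, y, k=2):
    """Stand-in base selector (an assumption, NOT the lasso):
    keep the k features most correlated with the response."""
    p = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def stability_selection(X, y, selector, n_subsamples=100, threshold=0.6, seed=0):
    """Run `selector` on random half-size subsamples and return the features
    whose selection frequency reaches `threshold`, plus all frequencies."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        chosen = selector([X[i] for i in idx], [y[i] for i in idx])
        for j in chosen:
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    selected = [j for j, f in enumerate(freqs) if f >= threshold]
    return selected, freqs


# Toy data (hypothetical): features 0 and 1 are noisy proxies for one latent
# variable z that drives y; features 2-4 are pure noise.
rng = random.Random(1)
X, y = [], []
for _ in range(200):
    z = rng.gauss(0, 1)
    X.append([z + rng.gauss(0, 0.1), z + rng.gauss(0, 0.1)]
             + [rng.gauss(0, 1) for _ in range(3)])
    y.append(z + rng.gauss(0, 0.5))

selected, freqs = stability_selection(
    X, y, lambda Xs, ys: top_k_selector(Xs, ys, k=1))
print("selection frequencies:", freqs)
print("selected features:", selected)
```

With a base selector that picks a single feature per subsample, the printed frequencies show how the votes are shared between the two proxies, which is the situation the abstract is concerned with. Replacing `top_k_selector` with a lasso fit would bring the sketch closer to the setting studied in the paper.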

Stability Feature Selection using Cluster Representative LASSO
This work proposes to cluster the variables first and then perform stability feature selection using the Lasso on cluster representatives, and finds an optimal and consistent solution for group variable selection in the high-dimensional regression setting.
Extensions of stability selection using subsamples of observations and covariates
We introduce extensions of stability selection, a method to stabilise variable selection methods introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417–473, 2010). We propose to apply a base
Variable selection with error control: another look at stability selection
Summary. Stability selection was recently introduced by Meinshausen and Bühlmann as a very general technique designed to improve the performance of a variable selection algorithm. It is based on
The Lasso Problem and Uniqueness
The LARS algorithm is extended to cover the non-unique case, so that this path algorithm works for any predictor matrix and a simple method is derived for computing the component-wise uncertainty in lasso solutions of any given problem instance, based on linear programming.
The Cluster Elastic Net for High-Dimensional Regression With Unknown Variable Grouping
This work proposes the cluster elastic net, which selectively shrinks the coefficients for such variables toward each other, rather than toward the origin, in the high-dimensional regression setting.
Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR
A new method called the OSCAR (octagonal shrinkage and clustering algorithm for regression) is proposed to simultaneously select variables while grouping them into predictive clusters, in addition to improving prediction accuracy and interpretation.
Regression Shrinkage and Selection via the Lasso
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
High-dimensional graphs and variable selection with the Lasso
It is shown that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs and is hence equivalent to variable selection for Gaussian linear models.
Stability selection for genome‐wide association
This article applies the recently proposed “stability selection” procedure of Meinshausen and Bühlmann to the problem of variable selection in genome‐wide association. In particular, it explores