Corpus ID: 15182196

Ultrahigh Dimensional Variable Selection: beyond the linear model

@article{Fan2008UltrahighDV,
  title={Ultrahigh Dimensional Variable Selection: beyond the linear model},
  author={Jianqing Fan and Richard J. Samworth and Yichao Wu},
  journal={arXiv: Methodology},
  year={2008}
}
Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently used techniques are based on independence screening; examples include correlation ranking or feature selection using a two-sample t-test in high-dimensional classification. Within the context of the linear model, Fan and Lv (2008) showed that this simple correlation ranking possesses a sure independence screening property under certain…
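
The correlation-ranking idea in the abstract is simple enough to sketch. Below is a minimal, illustrative Python implementation of marginal correlation screening in the spirit of sure independence screening; the function name, the cutoff d = n/log(n), and the toy data are conventions assumed for illustration, not code from the paper.

import numpy as np

def sis_screen(X, y, d=None):
    """Rank features by absolute marginal correlation with y and keep the top d."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # assumed cutoff; a common default in the SIS literature
    # Componentwise Pearson correlation between each column of X and y.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    omega = np.abs(Xc.T @ yc) / n        # |marginal correlations|
    keep = np.argsort(omega)[::-1][:d]   # indices of the d largest
    return np.sort(keep)

# Toy example: n = 100 observations, p = 1000 features, three of them active.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1000))
y = 2 * X[:, 0] - 1.5 * X[:, 3] + X[:, 7] + rng.standard_normal(100)
print(sis_screen(X, y)[:10])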

New developments of dimension reduction

TLDR
This thesis proposes a novel model-free variable selection method to deal with multi-population data by incorporating the grouping information and demonstrates that this method greatly outperforms CSIS for nonlinear models.

Sure independence screening for ultrahigh dimensional feature space

TLDR
The concept of sure screening is introduced and a sure screening method that is based on correlation learning, called sure independence screening, is proposed to reduce dimensionality from high to a moderate scale that is below the sample size.

Robust rank correlation based screening

Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p,…
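
A rank-based variant of this idea replaces the Pearson correlation in the screening statistic with a rank correlation such as Kendall's tau, which is robust to heavy tails and monotone transformations. A minimal sketch, assuming SciPy and a user-chosen cutoff d (illustrative choices, not the paper's code):

import numpy as np
from scipy.stats import kendalltau

def rank_screen(X, y, d):
    """Screen features by absolute Kendall-tau rank correlation with y."""
    tau = np.array([abs(kendalltau(X[:, j], y)[0]) for j in range(X.shape[1])])
    return np.sort(np.argsort(tau)[::-1][:d])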

Applications of penalized likelihood methods for feature selection in statistical modeling

TLDR
This dissertation develops approaches based on PLMs to deal with the issues of feature selection arising from several application fields, and proposes a novel screening approach via the sparsity-restricted maximum likelihood estimator that removes most of the irrelevant features before the formal selection.

A Selective Overview of Variable Selection in High Dimensional Feature Space.

TLDR
A brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection is presented and the properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized.

Group screening for ultra-high-dimensional feature under linear model

TLDR
This paper proposes a group screening method that performs variable selection on groups of variables in linear models based on working independence, and the sure screening property is established for this approach.

On marginal sliced inverse regression for ultrahigh dimensional model-free feature selection

TLDR
This paper extends the marginal coordinate test for sliced inverse regression (SIR) in Cook (2004) and proposes a novel marginal SIR utility for ultrahigh dimensional feature selection; a variant that ignores the correlation among the predictors, marginal independence SIR, is also proposed.

The Sparse MLE for Ultrahigh-Dimensional Feature Screening

TLDR
This article proposes a new screening method via the sparsity-restricted maximum likelihood estimator (SMLE), which naturally accounts for the joint effects of features in the screening process, giving it an edge over existing methods.
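
One way to read the SMLE idea: maximize the likelihood subject to a hard sparsity constraint on the coefficient vector, so that candidate features compete jointly rather than marginally. A minimal sketch for the Gaussian (least-squares) case, approximating the constrained fit by iterative hard thresholding; the step size, iteration count, and all names are assumptions of this sketch, not the article's algorithm.

import numpy as np

def smle_screen(X, y, k, n_iter=200):
    """Approximate sparsity-restricted least squares by iterative
    hard thresholding; return the selected support of size k."""
    n, p = X.shape
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # conservative gradient step size
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta += step * X.T @ (y - X @ beta)  # gradient step on the log-likelihood
        small = np.argsort(np.abs(beta))[:p - k]
        beta[small] = 0.0                    # hard-threshold all but the k largest
    return np.sort(np.flatnonzero(beta))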

Marginal and Interactive Feature Screening of Ultra-high Dimensional Feature Spaces with Multivariate Response

TLDR
This work proposes a new method (GenCorr) that admits a multivariate response, allowing multiple responses to be modeled as a single unit rather than as unrelated entities, which enables more robust analyses of complex traits embedded in the covariance structure of multiple responses.

High Dimensional Variable Selection with Error Control

TLDR
It is demonstrated that variable selection with the sequential use of FDR control and ISIS not only controlled the predefined FDR in the final models but also achieved relatively high AUROC scores.
...

References

Showing 1–10 of 44 references

Sure independence screening for ultrahigh dimensional feature space

TLDR
The concept of sure screening is introduced and a sure screening method that is based on correlation learning, called sure independence screening, is proposed to reduce dimensionality from high to a moderate scale that is below the sample size.

On Model Selection Consistency of Lasso

TLDR
It is proved that a single condition, which is called the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model both in the classical fixed p setting and in the large p setting as the sample size n gets large.
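
For reference, the strong Irrepresentable Condition can be written as follows, with C_{11} the Gram block of the truly relevant predictors, C_{21} the cross block, and beta_{(1)} the nonzero coefficients (notation paraphrased, not copied from the paper):

\[
  \bigl\| C_{21} C_{11}^{-1} \operatorname{sign}\bigl(\beta_{(1)}\bigr) \bigr\|_{\infty} \;\le\; 1 - \eta
  \quad \text{for some } \eta > 0 .
\]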

"Preconditioning" for feature selection and regression in high-dimensional problems

TLDR
This work proposes a method for variable selection that first estimates the regression function, yielding a "preconditioned" response variable, and shows that under a certain Gaussian latent variable model, application of the LASSO to the preconditioned response variable is consistent as the number of predictors and observations increases.
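
The two-stage recipe is concrete enough to sketch: first form a denoised ("preconditioned") response from a low-rank fit, then run the LASSO against that response. The sketch below uses plain principal components for stage one, a simplification of the paper's supervised principal components; scikit-learn and all parameter values are assumptions of this sketch.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso, LinearRegression

def precondition_then_lasso(X, y, n_components=3, alpha=0.1):
    """Stage 1: regress y on top principal components of X to get a
    denoised response y_hat.  Stage 2: LASSO of y_hat on X."""
    Z = PCA(n_components=n_components).fit_transform(X)
    y_hat = LinearRegression().fit(Z, y).predict(Z)  # preconditioned response
    return np.flatnonzero(Lasso(alpha=alpha).fit(X, y_hat).coef_)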

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

TLDR
In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; that is, as well as if the correct submodel were known.
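
The canonical nonconcave penalty from this article is SCAD, usually specified through its derivative for theta > 0 (the authors suggest a = 3.7):

\[
  p_{\lambda}'(\theta)
  = \lambda \left\{ \mathbf{1}(\theta \le \lambda)
  + \frac{(a\lambda - \theta)_{+}}{(a-1)\lambda} \, \mathbf{1}(\theta > \lambda) \right\},
  \qquad a > 2 .
\]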

The sparsity and bias of the Lasso selection in high-dimensional linear regression

Meinshausen and Bühlmann [Ann. Statist. 34 (2006) 1436–1462] showed that, for neighborhood selection in Gaussian graphical models, under a neighborhood stability condition, the LASSO is consistent,…

High-dimensional graphs and variable selection with the Lasso

TLDR
It is shown that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs and is hence equivalent to variable selection for Gaussian linear models.
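
Meinshausen and Bühlmann's recipe reduces to a per-node regression: LASSO each variable on all the others and connect a pair of nodes when a coefficient is nonzero. A minimal sketch, assuming scikit-learn, a fixed penalty level, and the "OR" rule for symmetrizing edges (all illustrative choices):

import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_select(X, alpha=0.1):
    """Estimate a sparse graph: node j is linked to the nonzero
    coefficients of a LASSO of X[:, j] on the remaining columns."""
    p = X.shape[1]
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        coef = Lasso(alpha=alpha).fit(X[:, others], X[:, j]).coef_
        adj[j, others] = coef != 0
    return adj | adj.T   # "OR" rule: keep an edge if either endpoint selects it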

Penalized Linear Unbiased Selection

TLDR
It is proved that for a universal penalty level, the MC+ has high probability of correct selection under much weaker conditions compared with existing results for the LASSO for large n and p, including the case of p ≫ n.
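
The MC+ pairs the minimax concave penalty (MCP) with a path algorithm; the MCP itself, for gamma > 1, is

\[
  \rho(t; \lambda) = \lambda \int_{0}^{|t|} \Bigl( 1 - \frac{x}{\gamma \lambda} \Bigr)_{+} \, dx ,
\]

which applies the full LASSO-level penalty rate near zero and no penalty beyond |t| = gamma * lambda, removing the LASSO's bias on large coefficients.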

Regression Shrinkage and Selection via the Lasso

TLDR
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
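
In symbols, the lasso estimate just described is the constrained least squares problem

\[
  \hat{\beta}^{\mathrm{lasso}}
  = \arg\min_{\beta_0, \beta} \sum_{i=1}^{n} \Bigl( y_{i} - \beta_{0} - \sum_{j=1}^{p} x_{ij} \beta_{j} \Bigr)^{2}
  \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_{j}| \le t .
\]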

Nonconcave penalized likelihood with a diverging number of parameters

A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed by Fan and Li to simultaneously estimate parameters and select important variables.

The Adaptive Lasso and Its Oracle Properties

TLDR
A new version of the lasso is proposed, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty, and the nonnegative garotte is shown to be consistent for variable selection.
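
Concretely, the adaptive lasso replaces the flat ℓ1 penalty with coefficient-specific weights, typically built from an initial root-n-consistent estimate with gamma > 0:

\[
  \hat{\beta}^{\mathrm{alasso}}
  = \arg\min_{\beta} \, \| y - X\beta \|_{2}^{2}
  + \lambda \sum_{j=1}^{p} \hat{w}_{j} |\beta_{j}| ,
  \qquad \hat{w}_{j} = 1 / |\hat{\beta}_{j}|^{\gamma} .
\]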