Sequential interaction group selection by the principle of correlation search for high-dimensional interaction models

Shan Luo and Zehua Chen, Statistica Sinica
$s_0$: the set of all relevant simple features; $s_0$: the set of relevant simple features contained in $Z_0^*$; $\Sigma_{Z_{jk}}$: the covariance matrix of $Z_{jk} = (X_j, X_k, X_jX_k)$; $\tilde{Y}(Z_0^*) = Y - \alpha - \Sigma_{y,z_0^*}\Sigma^{-1}_{z_0^*,z_0^*}Z_0^*$: the residual of $Y$ adjusting for the effects of $Z_0^*$, that is, the difference between $Y$ and its best linear predictor in terms of $Z_0^*$; $\Sigma_{y,z_{jk}}(Z_0^*)$: the covariance vector between $\tilde{Y}(Z_0^*)$ and $Z_{jk}$; $R(\tilde{Y}(Z_0^*), Z_{jk})$: the multiple correlation coefficient between $\tilde{Y}(Z_0^*)$ and $Z_{jk}$…
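
The two quantities defined above, the residual of Y after adjusting for an already selected set and its multiple correlation with an interaction group (Xj, Xk, XjXk), can be illustrated with sample analogues. The following is a minimal sketch, not the authors' implementation: it assumes ordinary least squares stands in for the population best linear predictor, and the function names `residual_after_adjusting` and `multiple_correlation` and the simulated data are hypothetical.

```python
import numpy as np

def residual_after_adjusting(y, Z0):
    """Sample analogue of the residual of y after removing the best
    linear predictor in terms of the columns of Z0 (with intercept)."""
    n = len(y)
    design = np.column_stack([np.ones(n), Z0])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def multiple_correlation(r, xj, xk):
    """Sample multiple correlation between r and the group (xj, xk, xj*xk)."""
    Zjk = np.column_stack([xj, xk, xj * xk])
    Zc = Zjk - Zjk.mean(axis=0)          # center the group
    rc = r - r.mean()                    # center the response
    coef, *_ = np.linalg.lstsq(Zc, rc, rcond=None)
    fitted = Zc @ coef
    # R is the square root of the fraction of variance explained by the group.
    return float(np.sqrt((fitted @ fitted) / (rc @ rc)))

# Illustrative data (an assumption): main effect of X0 and an X1*X2 interaction.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2] + 0.5 * rng.standard_normal(500)

r = residual_after_adjusting(y, X[:, [0]])       # adjust for the selected X0
print(multiple_correlation(r, X[:, 1], X[:, 2])) # relevant pair
print(multiple_correlation(r, X[:, 3], X[:, 4])) # irrelevant pair
```

In this simulated setting the relevant pair should show a much larger multiple correlation with the residual than the irrelevant pair, which is the kind of contrast a correlation search relies on.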

Two-level Bayesian interaction analysis for survival data incorporating pathway information

This approach is the first to conduct the analysis of lower-level gene-gene interactions and higher-level pathway-pathway interactions simultaneously, and proposes a variational inference framework based on the accelerated failure time model with favourable priors to account for two-level selection as well as censoring.

Automatic variable selection in a linear model on massive data

For a linear model on massive data, an aggregated estimator depending on adaptive LASSO estimators is proposed, which allows the reduction of the data storage volume and the introduction of new estimators.




A precise characterization of the effect of this hierarchy constraint is given; a bound on this estimate reveals the amount of fitting "saved" by the hierarchy constraint, and it is proved that hierarchy holds with probability one.

Group Regularized Estimation Under Structural Hierarchy

This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony and shows that the proposed estimators enjoy sharp rate oracle inequalities, and give the minimax lower bounds in strong and weak hierarchical variable selection.

Model Selection for High-Dimensional Quadratic Regression via Regularization

A new regularization method, called regularization algorithm under marginality principle (RAMP), is proposed to compute a hierarchy-preserving regularization solution path efficiently, and both methods are further extended to solve generalized QR models.

Forward Regression for Ultra-High Dimensional Variable Screening

The theoretical analysis reveals that FR can identify all relevant predictors consistently, even when the predictor dimension is substantially larger than the sample size, provided the dimension of the true model is finite.

Model selection and estimation in regression with grouped variables

Summary.  We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor

Regression Shrinkage and Selection via the Lasso

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
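
The constrained problem described in this summary can be written out explicitly. The notation below ($y_i$, $x_{ij}$, $\alpha$, $\beta_j$, and the bound $t$) is assumed for illustration and does not appear in the summary itself:

$$
(\hat{\alpha}, \hat{\beta}) = \arg\min_{\alpha,\,\beta} \sum_{i=1}^{n} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t .
$$

Here $t \ge 0$ plays the role of the "constant" in the summary: smaller values of $t$ shrink the coefficients more strongly and can set some of them exactly to zero.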

A Two-Stage Penalized Logistic Regression Approach to Case-Control Genome-Wide Association Studies

We propose a two-stage penalized logistic regression approach to case-control genome-wide association studies. This approach consists of a screening stage and a selection stage. In the screening

Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space

It is shown that, with probability converging to 1, the SLasso first selects all the relevant features before any irrelevant features can be selected, and that the EBIC decreases until it attains the minimum at the model consisting of exactly all the relevant features and then begins to increase, which establishes the selection consistency of SLasso.
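
The behaviour of the EBIC described in this summary can be illustrated numerically. The sketch below assumes the usual linear-model form of the extended BIC, n·log(RSS/n) + k·log(n) + 2γ·log C(p, k), for a model with k of p features; the function names `log_binom` and `ebic`, the choice γ = 1, and the simulated data are hypothetical and not taken from the paper.

```python
import math
import numpy as np

def log_binom(p, k):
    """Log of the binomial coefficient C(p, k), via log-gamma."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)

def ebic(y, X, subset, gamma=1.0):
    """Extended BIC of the least-squares fit of y on the columns in `subset`
    (plus an intercept), under the assumed linear-model form."""
    n, p = X.shape
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = len(subset)
    return n * math.log(rss / n) + k * math.log(n) + 2.0 * gamma * log_binom(p, k)

# Illustrative data (an assumption): only features 0 and 1 are relevant.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(200)

# EBIC along a nested sequence of models that adds the relevant features first.
for subset in ([], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3]):
    print(subset, ebic(y, X, subset))
```

Along such a nested sequence the criterion would be expected to fall while relevant features are added and to rise once irrelevant ones enter, mirroring the stopping behaviour the summary attributes to SLasso with EBIC.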

The composite absolute penalties family for grouped and hierarchical variable selection

CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.