RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs

@article{Fan2020RANKLI,
  title={RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs},
  author={Yingying Fan and Emre Demirkaya and Gaorong Li and Jinchi Lv},
  journal={Journal of the American Statistical Association},
  year={2020},
  volume={115},
  pages={362--379}
}
Abstract: Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this article, we provide theoretical foundations on the power and robustness of the model-X knockoffs procedure introduced recently by Candès, Fan, Janson and Lv, in the high-dimensional setting where the covariate distribution is characterized by a Gaussian graphical model. We establish that under mild regularity conditions, the…
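For Gaussian covariates the model-X construction is explicit: with X ~ N(0, Sigma) and a diagonal matrix D = diag(s) such that 2*Sigma - D is positive semidefinite, a knockoff copy is drawn from the conditional law X_tilde | X ~ N(X - X Sigma^{-1} D, 2D - D Sigma^{-1} D). Below is a minimal numpy sketch using the equicorrelated choice of s; the function name, seeding, and numerical jitter are illustrative conveniences, not details from the paper.

import numpy as np

def gaussian_knockoffs(X, Sigma, seed=None):
    """Sample second-order Gaussian knockoffs for rows X_i ~ N(0, Sigma).

    Uses the equicorrelated choice s_j = min(2 * lambda_min(Sigma), 1),
    valid when Sigma is a correlation matrix.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    s = min(2.0 * np.linalg.eigvalsh(Sigma)[0], 1.0) * np.ones(p)
    D = np.diag(s)
    Sigma_inv = np.linalg.inv(Sigma)
    # Conditional law of the knockoffs given X:
    #   X_tilde | X ~ N(X - X @ Sigma_inv @ D, 2D - D @ Sigma_inv @ D)
    mean = X - X @ Sigma_inv @ D
    cov = 2.0 * D - D @ Sigma_inv @ D
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(p))  # jitter for stability
    return mean + rng.standard_normal((n, p)) @ L.T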
High-Dimensional Knockoffs Inference for Time Series Data
TLDR
This paper proposes the method of time series knockoffs inference (TSKI), suggests a new knockoff statistic, the backward elimination ranking (BE) statistic, and shows that it enjoys both the sure screening property and controlled FDR in the linear time series model setting.
Kernel Knockoffs Selection for Nonparametric Additive Models
TLDR
This article proposes a novel kernel knockoffs selection procedure for the nonparametric additive model and shows that the proposed method is guaranteed to control the FDR at any finite sample size while achieving power that approaches one as the sample size tends to infinity.
Nodewise Knockoffs: False Discovery Rate Control for Gaussian Graphical Models
TLDR
This paper uses a sample-splitting-recycling procedure that first selects hyperparameters on half of the sample and then learns the structure of the graph using all samples, in such a way that the FDR control property still holds.
Power analysis of knockoff filters for correlated designs
TLDR
This work introduces the Conditional Independence knockoff, a simple procedure that competes with more sophisticated knockoff filters and is well defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse).
Error-based Knockoffs Inference for Controlled Feature Selection
TLDR
This paper proposes an error-based knockoff inference method that does not require specifying a regression model and can handle feature selection with theoretical guarantees on controlling false discovery proportion (FDP), FDR, or k-familywise error rate (k-FWER).
Robust Inference With Knockoffs
TLDR
The model-X knockoffs framework is shown to be robust to errors in the underlying assumptions on the distribution of X, making it an effective method for many practical applications where the underlying distribution of the features X1, ..., Xp is estimated accurately but not known exactly.
IPAD: Stable Interpretable Forecasting with Knockoffs Inference
TLDR
A new method of intertwined probabilistic factors decoupling (IPAD) for stable interpretable forecasting with knockoffs inference in high-dimensional models is suggested; it exhibits appealing finite-sample performance, with the desired interpretability and stability, compared to some popular forecasting methods.
Model-Free Statistical Inference on High-Dimensional Data
This paper aims to develop an effective model-free inference procedure for high-dimensional data, first reformulating the hypothesis testing problem via a sufficient dimension reduction framework. …
Large-scale model selection in misspecified generalized linear models
TLDR
The framework of model selection principles under misspecified generalized linear models presented in Lv and Liu (2014) is exploited, and the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecification is investigated.
Whiteout: when do fixed-X knockoffs fail?
TLDR
This work recasts the fixed-X knockoff filter for the Gaussian linear model as a conditional post-selection inference method and obtains the first negative results that universally upper-bound the power of all fixed-X knockoff methods, regardless of the choices made by the analyst.

References

Showing 1-10 of 101 references
Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection
TLDR
This work proposes the new framework of ‘model-X’ knockoffs, which rethinks from a different perspective the knockoff procedure originally designed for controlling the false discovery rate in linear models, and demonstrates the superior power of knockoffs through simulations.
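A common knockoff statistic in this framework is the lasso coefficient difference (LCD): fit the lasso on the augmented design [X, X_tilde] and set W_j = |b_j| - |b_{j+p}|, so that a large positive W_j favors the original variable over its knockoff. A hedged scikit-learn sketch (the function name and CV settings are illustrative, not prescribed by the paper):

import numpy as np
from sklearn.linear_model import LassoCV

def lcd_statistics(X, X_tilde, y):
    """Lasso coefficient difference W_j = |b_j| - |b_{j+p}| from a
    cross-validated lasso fit on the augmented design [X, X_tilde]."""
    p = X.shape[1]
    coef = LassoCV(cv=5).fit(np.hstack([X, X_tilde]), y).coef_
    return np.abs(coef[:p]) - np.abs(coef[p:])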
A knockoff filter for high-dimensional selective inference
TLDR
It is proved that the high-dimensional knockoff procedure 'discovers' important variables as well as the directions (signs) of their effects, in such a way that the expected proportion of wrongly chosen signs is below the user-specified level.
Controlling the false discovery rate via knockoffs
TLDR
The knockoff filter, a new variable selection procedure that controls the FDR in the statistical linear model whenever there are at least as many observations as variables, is introduced; empirical results show that the method has far more power than existing selection rules when the proportion of null variables is high.
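The filter's selection rule is a data-dependent threshold on the statistics W_j: select {j : W_j >= T}, where T is the smallest t at which the estimated false discovery proportion (offset + #{j : W_j <= -t}) / max(#{j : W_j >= t}, 1) falls below the target level q; offset = 1 gives the knockoff+ variant with exact FDR control. A minimal sketch (the function name is illustrative):

import numpy as np

def knockoff_select(W, q=0.1, offset=1):
    """Apply the knockoff(+) data-dependent threshold to statistics W."""
    for t in np.sort(np.abs(W[W != 0])):  # candidate thresholds
        fdp_hat = (offset + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return np.where(W >= t)[0]
    return np.array([], dtype=int)  # nothing passes: select no variables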
Control of the False Discovery Rate Under Arbitrary Covariance Dependence
TLDR
This paper derives the theoretical distribution of the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, provides a consistent estimate of the FDP, and proposes a new methodology based on principal factor approximation that successfully subtracts the common dependence and significantly weakens the correlation structure.
Tuning-Free Heterogeneous Inference in Massive Networks
TLDR
This article exploits multiple networks with Gaussian graphs to encode the connectivity patterns of a large number of features on the subpopulations, suggesting a framework of large-scale tuning-free heterogeneous inference in which the number of networks is allowed to diverge.
Elementary Estimators for High-Dimensional Linear Regression
TLDR
This paper addresses the problem of structurally constrained high-dimensional linear regression at the source, by asking whether one can build simpler, possibly closed-form estimators that nonetheless come with statistical guarantees comparable to those of regularized likelihood estimators.
Estimating False Discovery Proportion Under Arbitrary Covariance Dependence
TLDR
An approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used is derived, and a consistent estimate of the realized FDP is provided, with important applications in controlling the false discovery rate and FDP.
Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space
TLDR
This article characterizes the asymptotic equivalence of regularization methods, with general penalty functions, in a thresholded parameter space under the generalized linear model setting, where the dimensionality can grow exponentially with the sample size.
Ultrahigh Dimensional Variable Selection: beyond the linear model
TLDR
This paper extends ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case, and introduces a new technique to reduce the false discovery rate in the feature screening stage.
Adapting to unknown sparsity by controlling the false discovery rate
TLDR
This work provides a new perspective on a class of model selection rules introduced recently by several authors and exhibits their close connection with procedures that maintain stringent control of the false discovery rate.