# RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs

```bibtex
@article{Fan2020RANKLI,
  title   = {RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs},
  author  = {Yingying Fan and Emre Demirkaya and Gaorong Li and Jinchi Lv},
  journal = {Journal of the American Statistical Association},
  year    = {2020},
  volume  = {115},
  pages   = {362--379}
}
```

## Abstract

Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this article, we provide theoretical foundations on the power and robustness of the model-X knockoffs procedure introduced recently by Candès, Fan, Janson, and Lv, in the high-dimensional setting where the covariate distribution is characterized by a Gaussian graphical model. We establish that under mild regularity conditions, the…
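The selection rule at the heart of the knockoffs procedure discussed above can be illustrated with a short sketch. The code below is a minimal, hypothetical illustration (the statistics `W` are simulated rather than computed from data): given one knockoff statistic per feature, the knockoff+ threshold is the smallest cutoff at which the estimated false discovery proportion drops below the target level q.

```python
import numpy as np

def knockoff_plus_threshold(W, q=0.1):
    """Knockoff+ threshold: smallest t such that the estimated FDP
    (1 + #{j : W_j <= -t}) / #{j : W_j >= t} is at most q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # nothing can be selected at level q

# Hypothetical statistics: features 0-4 are strong signals, the rest noise.
rng = np.random.default_rng(0)
W = np.concatenate([rng.uniform(3.0, 5.0, 5), rng.normal(0.0, 0.5, 45)])
t = knockoff_plus_threshold(W, q=0.2)
selected = np.flatnonzero(W >= t)
```

In practice, each W_j is computed by fitting a model on the augmented design [X, X̃], for instance as the difference of lasso coefficient magnitudes for a feature and its knockoff; large positive values are evidence for a true signal.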

## 46 Citations

High-Dimensional Knockoffs Inference for Time Series Data

- Computer Science
- 2021

This paper proposes the method of time series knockoffs inference (TSKI), and suggests the new knockoff statistic, the backward elimination ranking (BE) statistic, and shows that it enjoys both the sure screening property and controlled FDR in the linear time series model setting.

Kernel Knockoffs Selection for Nonparametric Additive Models

- Mathematics, Computer Science
- Journal of the American Statistical Association
- 2022

This article proposes a novel kernel knockoffs selection procedure for the nonparametric additive model, and shows that the proposed method is guaranteed to control the FDR under any finite sample size, and achieves a power that approaches one as the sample size tends to infinity.

Nodewise Knockoffs: False Discovery Rate Control for Gaussian Graphical Models

- Computer Science
- 2019

This paper uses a sample-splitting-recycling procedure that first uses half of the sample to select hyperparameters, then learns the structure of the graph using all samples in a certain way such that the FDR control property still holds.

Power analysis of knockoff filters for correlated designs

- Computer Science
- NeurIPS
- 2019

This work introduces the Conditional Independence knockoff, a simple procedure that competes with more sophisticated knockoff filters and that is well defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse).

Error-based Knockoffs Inference for Controlled Feature Selection

- Computer Science
- arXiv
- 2022

This paper proposes an error-based knockoff inference method that does not require specifying a regression model and can handle feature selection with theoretical guarantees on controlling false discovery proportion (FDP), FDR, or k-familywise error rate (k-FWER).

Robust Inference With Knockoffs

- Mathematics, Computer Science
- 2019

The model-X knockoffs framework is robust to errors in the underlying assumptions on the distribution of X, making it an effective method for many practical applications where the underlying distribution of the features X1, …, Xp is estimated accurately but not known exactly.

IPAD: Stable Interpretable Forecasting with Knockoffs Inference

- Computer Science
- Journal of the American Statistical Association
- 2020

A new method of intertwined probabilistic factors decoupling (IPAD) is suggested for stable interpretable forecasting with knockoffs inference in high-dimensional models; it has appealing finite-sample performance with desired interpretability and stability compared with some popular forecasting methods.

Model-Free Statistical Inference on High-Dimensional Data

- Mathematics
- 2022

This paper aims to develop an effective model-free inference procedure for high-dimensional data. We first reformulate the hypothesis testing problem via a sufficient dimension reduction framework. With…

Large-scale model selection in misspecified generalized linear models

- Computer Science
- Biometrika
- 2021

The framework of model selection principles under misspecified generalized linear models presented in Lv and Liu (2014) is exploited, and the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecification is investigated.

Whiteout: when do fixed-X knockoffs fail?

- Computer Science
- 2021

This work recasts the fixed-X knockoff filter for the Gaussian linear model as a conditional post-selection inference method, and obtains the first negative results that universally upper-bound the power of all fixed-X knockoff methods, regardless of the choices made by the analyst.

## References

Showing 1–10 of 101 references

Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection

- Computer Science
- 2016

This work proposes the new framework of ‘model‐X’ knockoffs, which reinterprets from a different perspective the knockoff procedure originally designed for controlling the false discovery rate in linear models, and demonstrates the superior power of knockoffs through simulations.
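For Gaussian covariates, the model-X construction admits an explicit form. The sketch below draws second-order knockoffs using the equicorrelated choice of the diagonal matrix D = diag(s); the function name, the shrink factor, and the AR(1) covariance are hypothetical choices for the example, not code from the paper.

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, rng):
    """Sample model-X knockoffs for rows X ~ N(0, Sigma), using the
    equicorrelated choice s = min(2*lambda_min(Sigma), 1) on the diagonal."""
    p = Sigma.shape[0]
    # Shrink s slightly so the conditional covariance stays strictly positive definite.
    s = 0.99 * min(2.0 * np.linalg.eigvalsh(Sigma)[0], 1.0)
    D = s * np.eye(p)
    Sigma_inv = np.linalg.inv(Sigma)
    # Conditional law of the knockoffs given X:
    #   mean = (I - D Sigma^{-1}) X,   cov = 2D - D Sigma^{-1} D
    mean = X @ (np.eye(p) - Sigma_inv @ D)
    cov = 2.0 * D - D @ Sigma_inv @ D
    L = np.linalg.cholesky(cov)
    return mean + rng.standard_normal(X.shape) @ L.T

# Example with a small AR(1) correlation matrix (hypothetical data).
rng = np.random.default_rng(1)
p = 5
Sigma = 0.4 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.standard_normal((2000, p)) @ np.linalg.cholesky(Sigma).T
X_knock = gaussian_knockoffs(X, Sigma, rng)
```

The joint covariance of (X, X̃) is [[Σ, Σ−D], [Σ−D, Σ]], so each knockoff matches its feature's marginal distribution while being decoupled from it by the amount s.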

A knockoff filter for high-dimensional selective inference

- Computer Science, Mathematics
- The Annals of Statistics
- 2019

It is proved that the high-dimensional knockoff procedure 'discovers' important variables as well as the directions (signs) of their effects, in such a way that the expected proportion of wrongly chosen signs is below the user-specified level.

Controlling the false discovery rate via knockoffs

- Computer Science
- 2015

The knockoff filter is introduced, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables, and empirical results show that the resulting method has far more power than existing selection rules when the proportion of null variables is high.

Control of the False Discovery Rate Under Arbitrary Covariance Dependence

- Computer Science
- 2010

This paper derives the theoretical distribution of the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, provides a consistent estimate of the FDP, and proposes a new methodology based on principal factor approximation, which successfully subtracts the common dependence and significantly weakens the correlation structure.

Tuning-Free Heterogeneous Inference in Massive Networks

- Computer Science
- Journal of the American Statistical Association
- 2019

This article exploits multiple networks with Gaussian graphs to encode the connectivity patterns of a large number of features on the subpopulations to suggest a framework of large-scale tuning-free heterogeneous inference, where the number of networks is allowed to diverge.

Elementary Estimators for High-Dimensional Linear Regression

- Computer Science, Mathematics
- ICML
- 2014

This paper addresses the problem of structurally constrained high-dimensional linear regression at the source, by asking whether one can build simpler, possibly closed-form estimators that nevertheless come with statistical guarantees comparable to those of regularized likelihood estimators.

Estimating False Discovery Proportion Under Arbitrary Covariance Dependence

- Computer Science
- Journal of the American Statistical Association
- 2012

An approximate expression is derived for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, and a consistent estimate of the realized FDP is provided; this has important applications in controlling the false discovery rate and FDP.

Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space

- Mathematics, Computer Science
- 2013

This article characterizes the asymptotic equivalence of regularization methods with general penalty functions in a thresholded parameter space under the generalized linear model setting, where the dimensionality can grow exponentially with the sample size.

Ultrahigh Dimensional Variable Selection: beyond the linear model

- Computer Science
- 2008

This paper extends ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case, and introduces a new technique to reduce the false discovery rate in the feature screening stage.

Adapting to unknown sparsity by controlling the false discovery rate

- Computer Science
- 2005

This work provides a new perspective on a class of model selection rules which has been introduced recently by several authors, and exhibits a close connection with FDR-controlling procedures under stringent control of the false discovery rate.