Corpus ID: 238634585

The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control

Jasin Machkour, Michael Muma, Daniel Pérez Palomar
We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dimensional data. The T-Knock filter controls a user-defined target false discovery rate (FDR) while maximizing the number of selected variables. This is achieved by fusing the solutions of multiple early-terminated random experiments, which are conducted on a combination of the original predictors and multiple sets of randomly generated knockoff predictors. A finite-sample proof based on…
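The abstract names the key ingredients: knockoff predictors appended to the design, random experiments that are terminated early, and a fusion step over the experiments. The following is a minimal illustrative sketch of those ingredients, not the authors' exact algorithm; the correlation-based forward-selection rule, the termination count `T`, and the voting fraction `pi` are all simplifying assumptions made for the example.

```python
import numpy as np

def terminated_experiment(X, y, n_knockoffs, T, rng):
    """One random experiment: append i.i.d. Gaussian knockoffs to X and run
    forward selection by absolute correlation, stopping as soon as T knockoff
    columns have entered the model.  Returns the original variables selected."""
    n, p = X.shape
    knockoffs = rng.standard_normal((n, n_knockoffs))  # null predictors by construction
    Xa = np.hstack([X, knockoffs])
    residual = np.asarray(y, dtype=float).copy()
    active = np.zeros(p + n_knockoffs, dtype=bool)
    selected, knock_hits = [], 0
    while knock_hits < T:
        scores = np.abs(Xa.T @ residual)
        scores[active] = -np.inf          # never re-pick an active column
        j = int(np.argmax(scores))
        active[j] = True
        if j >= p:
            knock_hits += 1               # a knockoff entered: closer to termination
        else:
            selected.append(j)
        xj = Xa[:, j]                     # crude forward step: deflate the residual
        residual = residual - xj * (xj @ residual) / (xj @ xj)
    return set(selected)

def fuse_experiments(X, y, K=20, n_knockoffs=None, T=1, pi=0.5, seed=0):
    """Fuse K early-terminated experiments: keep variables that were selected
    in at least a fraction `pi` of the experiments."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    n_knockoffs = n_knockoffs or p
    counts = np.zeros(p)
    for _ in range(K):
        for j in terminated_experiment(X, y, n_knockoffs, T, rng):
            counts[j] += 1
    return np.flatnonzero(counts / K >= pi)
```

Because the knockoffs are pure noise, the number of knockoffs entering before termination gives each experiment a built-in calibration of how many false positives a selection rule admits; the fusion step then filters out variables that only appear sporadically.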


Controlling the false discovery rate via knockoffs
In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated…
A knockoff filter for high-dimensional selective inference
This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In…
Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection
A new framework of model-X knockoffs is proposed, which views from a different perspective the knockoff procedure originally designed for controlling the false discovery rate in linear models, and demonstrates the superior power of knockoffs through simulations.
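Knockoff methods select variables via a data-dependent threshold on per-variable statistics W_j, constructed so that W_j tends to be large and positive when the original variable beats its knockoff and symmetric around zero for nulls. Assuming the W statistics have already been computed elsewhere (e.g., as differences of Lasso coefficient magnitudes for each variable and its knockoff), the thresholding step itself is short:

```python
import numpy as np

def knockoff_threshold(W, q, offset=1):
    """Data-dependent threshold of the knockoff (offset=0) / knockoff+
    (offset=1) filter: the smallest t > 0 such that
        (offset + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q.
    Variables with W_j >= t are selected; returns inf if no t qualifies."""
    ts = np.sort(np.abs(W[W != 0]))
    for t in ts:
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf
```

The ratio being thresholded is an estimate of the false discovery proportion: by the sign-symmetry of null W_j, the count of large negative statistics over-counts the false positives among the large positive ones.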
p-Values for High-Dimensional Regression
Assigning significance in high-dimensional regression is challenging. Most computationally efficient selection algorithms cannot guard against inclusion of noise variables. Asymptotically valid…
Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach
Summary. The false discovery rate (FDR) is a multiple hypothesis testing quantity that describes the expected proportion of false positive results among all rejected null hypotheses. Benjamini and…
Gene hunting with hidden Markov model knockoffs
An exact and efficient algorithm is developed to sample knockoff variables in this setting and it is argued that, combined with the existing selective framework, this provides a natural and powerful tool for inference in genome‐wide association studies with guaranteed false discovery rate control.
Controlling the false discovery rate: a practical and powerful approach to multiple testing
SUMMARY. The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to…
Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR…
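The Benjamini–Hochberg step-up procedure referenced in these entries is simple to state in code: sort the m p-values, find the largest k with p_(k) <= k·q/m, and reject the k smallest hypotheses. A NumPy sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, q):
    """BH step-up procedure at target FDR level q.
    Returns a boolean rejection mask aligned with the input p-values."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of sorted p-values
    thresh = q * np.arange(1, m + 1) / m           # step-up boundaries k*q/m
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest k meeting the bound
        reject[order[: k + 1]] = True              # reject the k+1 smallest
    return reject
```

Note the "step-up" aspect: every p-value below the largest qualifying boundary is rejected, even if some intermediate p-value exceeds its own boundary.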
Variable selection with error control: another look at stability selection
Summary. Stability selection was recently introduced by Meinshausen and Bühlmann as a very general technique designed to improve the performance of a variable selection algorithm. It is based on…
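Stability selection wraps any base selector in repeated subsampling and keeps only the variables that are selected consistently. A toy sketch of the idea, using top-k absolute correlation as a stand-in base selector (the original papers use the Lasso; `top_k`, `threshold`, and the half-size subsamples are illustrative choices, not prescriptions from the method):

```python
import numpy as np

def stability_selection(X, y, top_k=2, n_subsamples=100, threshold=0.6, seed=0):
    """Run a base selector on many half-size subsamples and return the
    variables whose selection frequency reaches `threshold`, plus all
    frequencies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)  # random half of the rows
        scores = np.abs(X[idx].T @ y[idx])               # stand-in base selector
        counts[np.argsort(scores)[-top_k:]] += 1         # its top_k picks
    freq = counts / n_subsamples
    return np.flatnonzero(freq >= threshold), freq
```

The selection frequencies, rather than a single fitted model, become the object of inference: noise variables enter only occasionally across subsamples, so a frequency cutoff filters them out regardless of how the base selector was tuned.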
False Discoveries Occur Early on the Lasso Path
It is demonstrated that true features and null features are always interspersed on the Lasso path, and that this phenomenon occurs no matter how strong the effect sizes are.