Corpus ID: 238407835

Deploying the Conditional Randomization Test in High Multiplicity Problems

@inproceedings{Li2021DeployingTC,
  title={Deploying the Conditional Randomization Test in High Multiplicity Problems},
  author={Shuangning Li and Emmanuel J. Cand{\`e}s},
  year={2021}
}
This paper introduces the sequential CRT, a variable selection procedure that combines the conditional randomization test (CRT) with Selective SeqStep+. Valid p-values are constructed via the flexible CRT, then ordered and passed through the Selective SeqStep+ filter to produce a list of discoveries. We develop theory guaranteeing control of the false discovery rate (FDR) even though the p-values are not independent. We show in simulations that our novel procedure indeed…
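The filtering step described above can be illustrated with a minimal Python sketch of the Selective SeqStep+ rule as it is commonly stated: find the largest k for which (1 + number of p-values above c among the first k) divided by (number at or below c among the first k, floored at 1) is at most (1 - c)/c times q, then report the hypotheses among the first k whose p-value is at most c. The function name `selective_seqstep_plus`, the default values of `q` and `c`, and the toy p-values are assumptions made for illustration. The abstract does not specify how the CRT p-values are computed or ordered, so the sketch takes an already ordered list as input and is not the authors' implementation.

```python
import numpy as np


def selective_seqstep_plus(pvals, q=0.1, c=0.5):
    """Sketch of the Selective SeqStep+ filter (illustrative, hypothetical helper).

    pvals: p-values listed in the order in which hypotheses are examined.
    q:     target FDR level (assumed default).
    c:     threshold separating "small" from "large" p-values (assumed default).
    Returns the positions (0-based, in the given order) of the discoveries.
    """
    p = np.asarray(pvals, dtype=float)
    small = p <= c
    n_small = np.cumsum(small)    # count of p_j <= c among the first k
    n_large = np.cumsum(~small)   # count of p_j > c among the first k
    # Conservative estimate of the false discovery proportion at each k.
    fdp_hat = (1 + n_large) / np.maximum(n_small, 1)
    ok = np.nonzero(fdp_hat <= (1 - c) / c * q)[0]
    if ok.size == 0:
        return np.array([], dtype=int)
    k_hat = ok[-1]                # largest k satisfying the bound
    return np.nonzero(small[: k_hat + 1])[0]


# Toy input: ten small p-values followed by ten large ones (made-up numbers).
toy_pvals = [0.01] * 10 + [0.9] * 10
discoveries = selective_seqstep_plus(toy_pvals, q=0.2, c=0.5)
```

The output depends on the order of the input: the filter works best when hypotheses more likely to be non-null appear early in the list.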
1 Citation

Individualized conditional independence testing under model-X with heterogeneous samples and interactions

Model-X knockoffs and the conditional randomization test are methods that search for conditional associations in large data sets, controlling the type-I errors if the joint distribution of the

References


Controlling the false discovery rate via knockoffs

The knockoff filter is introduced, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables, and empirical results show that the resulting method has far more power than existing selection rules when the proportion of null variables is high.

Fast and Powerful Conditional Randomization Testing via Distillation

The distilled CRT is proposed, a novel approach to using state-of-the-art machine learning algorithms in the CRT while drastically reducing the number of times those algorithms need to be run, thereby taking advantage of their power and the CRT's statistical guarantees without suffering the usual computational expense.

Discussion: An estimate of the science-wise false discovery rate and applications to top medical journals by Jager and Leek.

Prof. Jager and Prof. Leek took it upon themselves to find what actually is happening in current medical research if the authors assume that the practice is to publish only the findings significant at the 0.05 level, and utilized FDR methods that are being used in a single research project to study the phenomenon across studies.

Gene hunting with hidden Markov model knockoffs

An exact and efficient algorithm is developed to sample knockoff variables in this setting and it is argued that, combined with the existing selective framework, this provides a natural and powerful tool for inference in genome‐wide association studies with guaranteed false discovery rate control.

Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection

This work proposes a new framework of ‘model‐X’ knockoffs, which reads from a different perspective the knockoff procedure that was originally designed for controlling the false discovery rate in linear models, and demonstrates the superior power of knockoffs through simulations.

The Holdout Randomization Test for Feature Selection in Black Box Models

The holdout randomization test (HRT), an approach to feature selection using black box predictive models, is applied to two case studies from the scientific literature where heuristics were originally used to select important features for predictive models.

Derandomizing Knockoffs.

The derandomization step is designed to be flexible and can be adapted to any variable selection base procedure to yield stable decisions without compromising statistical power. It is applied to multi-stage genome-wide association studies of prostate cancer and finds that the reported associations have been replicated.

A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics

We introduce tools for controlled variable selection to economists. In particular, we apply a recently introduced aggregation scheme for false discovery rate (FDR) control to German administrative

Causal inference in genetic trio studies

This work introduces a method to draw causal inferences—inferences immune to all possible confounding—from genetic data that include parents and offspring that is based only on a well-established mathematical model of recombination and make no assumptions about the relationship between the genotypes and phenotypes.

Loss-of-function variants in CTNNA1 detected on multigene panel testing in individuals with gastric or breast cancer

De-identified data from 151,425 individuals who underwent CTNNA1 testing at a commercial laboratory between October 2015 and July 2019 were reviewed, and immunohistochemistry showed decreased α-E-catenin expression in gastric cancers.