Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn

@article{Phipson2010PermutationPS,
title={Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn},
author={Belinda Phipson and Gordon K. Smyth},
journal={Statistical Applications in Genetics and Molecular Biology},
year={2010},
volume={9}
}
• Published 31 October 2010
• Mathematics
• Statistical Applications in Genetics and Molecular Biology
Permutation tests are amongst the most commonly used statistical tools in modern genomic research, a process by which p-values are attached to a test statistic by randomly permuting the sample or gene labels. Yet permutation p-values published in the genomic literature are often computed incorrectly, understated by about 1/m, where m is the number of permutations. The same is often true in the more general situation when Monte Carlo simulation is used to assign p-values. Although the p-value…
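The abstract is cut off before it states the remedy. As a minimal sketch, assuming the widely used estimator (b + 1) / (m + 1), where b is the number of randomly drawn permutations whose statistic is at least as extreme as the observed one, the following compares it with the naive b / m. The function name `permutation_pvalues`, the difference-in-means statistic, and the sample data are illustrative assumptions, not taken from the paper.

```python
import random


def permutation_pvalues(x, y, m=999, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (b, naive, corrected), where b counts the random permutations
    whose statistic is at least as extreme as the observed one.
    Illustrative sketch only; the statistic is an assumption.
    """
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)

    def stat(values):
        # Absolute difference in means between the first n_x values and the rest.
        a, rest = values[:n_x], values[n_x:]
        return abs(sum(a) / len(a) - sum(rest) / len(rest))

    observed = stat(pooled)

    b = 0
    for _ in range(m):
        rng.shuffle(pooled)  # randomly reassign the group labels
        if stat(pooled) >= observed:
            b += 1

    naive = b / m                  # can be exactly zero
    corrected = (b + 1) / (m + 1)  # never below 1 / (m + 1)
    return b, naive, corrected


b, naive, corrected = permutation_pvalues(
    [10, 11, 12, 13, 14, 15], [0, 1, 2, 3, 4, 5], m=99, seed=1
)
print(f"b={b}, naive={naive:.4f}, corrected={corrected:.4f}")
```

With well-separated groups the naive estimate can be exactly zero, whereas the corrected estimate is bounded below by 1 / (m + 1), and the gap between the two is on the order of the 1/m understatement the abstract describes.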
426 Citations

Fast approximation of small p‐values in permutation tests by partitioning the permutations

• Mathematics
Biometrics
• 2018
In simulations, the proposed asymptotic approximation and resampling algorithm are more computationally efficient than another leading alternative, particularly for extremely small p-values; applied to cancer genomic data, the methods successfully identify up- and down-regulated genes.

Optimal allocation of samples for Monte-Carlo based multiple testing and comparison to Thompson Sampling

• Mathematics
• 2015
Multiple testing is often carried out in practice using approximated p-values obtained, for instance, via bootstrap or permutation tests. We are interested in allocating a pre-specified total number

Valid Monte Carlo Permutation Tests for Genetic Case‐Control Studies With Missing Genotypes

• Mathematics
Genetic epidemiology
• 2014
A rigorous theoretical framework for verifying the validity of Monte Carlo permutation tests when there are missing genotypes is developed and results are verified and supplemented by simulations for a variety of missing data processes and test statistics.

QuickMMCTest - Higher accuracy for multiple testing corrections

• Mathematics
• 2014
QuickMMCTest, a new heuristic method to assess the statistical significance of multiple hypotheses, is introduced and compared in a simulation study to three other methods: a naive approach that draws a constant number of replicates or permutations for each hypothesis, the recent MCFDR algorithm, and the MMCTest algorithm.

Exact testing with random permutations

• Mathematics, Computer Science
Test
• 2018
This paper provides an alternative proof, viewing the test as a “conditional Monte Carlo test” as it has been called in the literature, and results can be used to prove properties of various multiple testing procedures based on random permutations.

Robust methods to detect disease-genotype association in genetic association studies: calculate p-values using exact conditional enumeration instead of simulated permutations or asymptotic approximations

• Mathematics
Statistical applications in genetics and molecular biology
• 2014
If all monotone genetic models are of interest, the best performance in the situations under study is achieved for the robust test statistics based on the maximum over a range of Cochran-Armitage trend tests with different scores and for the constrained likelihood ratio test.

Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits

• Computer Science
ICML
• 2019
This paper proposes Adaptive MC multiple Testing (AMT) to estimate MC p-values and control false discovery rate in multiple testing, which outputs the same result as the standard full MC approach with high probability while requiring only $\tilde{O}(\sqrt{n}m)$ samples.

An adaptive permutation approach for genome-wide association study: evaluation and recommendations for use

• Computer Science
BioData Mining
• 2013
The current study provides evidence of the validity of the approach, and importantly provides guidance on the proper implementation of such a strategy, and tools are made available to aid investigators in implementing these approaches.

MERIT: controlling Monte-Carlo error rate in large-scale Monte-Carlo hypothesis testing

• Computer Science
bioRxiv
• 2022
MERIT (Monte-Carlo Error Rate control In large-scale MC hypothesis Testing), a method for large-scale MC hypothesis testing that also controls the MCER but is more statistically efficient than the GH method, is proposed.

References

ROAST: rotation gene set tests for complex microarray experiments

• Mathematics
Bioinform.
• 2010
ROAST is a statistically rigorous gene set test that allows for gene-wise correlation while being applicable to almost any experimental design, and uses rotation, a Monte Carlo technology for multivariate regression, instead of permutation.

Rotation testing in gene set enrichment analysis for small direct comparison experiments.

• Mathematics
Statistical applications in genetics and molecular biology
• 2009
The proposed rotation test is a generalisation of the permutation test, and can in addition be used on indirect comparison data and for testing significance of other types of test statistics outside the GSEA framework.

Rotation testing in gene set enrichment analysis for small direct comparison experiments.

• Mathematics
• 2009
Gene Set Enrichment Analysis (GSEA) is a method for analysing gene expression data with a focus on a priori defined gene sets. The permutation test generally used in GSEA for testing the significance

Permutation Methods: A Basis for Exact Inference

The reasoning behind permutation methods for exact inference is discussed and situations when they are exact and distribution-free are described.

Introduction to Modern Nonparametric Statistics

1. ONE-SAMPLE METHODS. Preliminaries. A Nonparametric Test and Confidence Interval for the Median. Estimating the Population CDF and Quantiles. A Comparison of Statistical Tests. 2. TWO-SAMPLE

Analyzing gene expression data in terms of gene sets: methodological issues

• Biology
Bioinform.
• 2007
It is argued that methods that competitively test each gene set against the rest of the genes create an unnecessary rift between single gene testing and gene set testing.

Bootstrap Methods and their Application

• Computer Science
• 1997
This book gives a broad and up-to-date coverage of bootstrap methods, with numerous applied examples, developed in a coherent way with the necessary theoretical basis, including improved Monte Carlo simulation.

Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses

This book provides a step-by-step manual on the application of permutation tests in biology, medicine, science, and engineering and shows how the problems of missing and censored data, nonresponders, after-the-fact covariates, and outliers may be handled.

Randomization, Bootstrap and Monte Carlo Methods in Biology

Preface to the Second Edition Preface to the First Edition Randomization The Idea of a Randomization Test Examples of Randomization Tests Aspects of Randomization Testing Raised by the Examples

The Design of Experiments

• J. I
• Economics
Nature
• 1936
READERS of “Statistical Methods for Research Workers” will welcome Prof. Fisher's new book, which is partly devoted to a development of the logical ideas underlying the earlier volume and