The False Positive Risk: A Proposal Concerning What to Do About p-Values

@article{Colquhoun2019TheFP,
  title={The False Positive Risk: A Proposal Concerning What to Do About p-Values},
  author={David Colquhoun},
  journal={The American Statistician},
  year={2019},
  volume={73},
  pages={192--201}
}
  • D. Colquhoun
  • Published 13 February 2018
  • Mathematics, Psychology
  • The American Statistician
Abstract It is widely acknowledged that the biomedical literature suffers from a surfeit of false positive results. Part of the reason for this is the persistence of the myth that observation of p < 0.05 is sufficient justification to claim that you have made a discovery. It is hopeless to expect users to change their reliance on p-values unless they are offered an alternative way of judging the reliability of their conclusions. If the alternative method is to have a chance of being adopted… 
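As a rough illustration of why p < 0.05 alone is weak grounds for a discovery claim, the sketch below computes a false positive risk from an assumed prior probability, significance threshold, and power. It uses the simple "all tests with p below the threshold" form of Bayes' rule, which is an assumption made here for illustration and not necessarily the calculation used in the paper. The function names `false_positive_risk` and `prior_for_fpr` are hypothetical.

```python
def false_positive_risk(prior, alpha=0.05, power=0.8):
    """Fraction of 'positive' tests (p < alpha) that are false positives.

    prior: assumed probability that a real effect exists (illustrative input).
    alpha: significance threshold.
    power: probability of detecting a real effect when one exists.
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    false_pos = alpha * (1.0 - prior)  # no real effect, yet p < alpha
    true_pos = power * prior           # real effect, and it is detected
    return false_pos / (false_pos + true_pos)


def prior_for_fpr(target_fpr, alpha=0.05, power=0.8):
    """Prior probability needed to reach a target false positive risk.

    Obtained by solving the expression in false_positive_risk for the prior.
    """
    if not 0.0 <= target_fpr <= 1.0:
        raise ValueError("target_fpr must lie in [0, 1]")
    numerator = alpha * (1.0 - target_fpr)
    return numerator / (numerator + power * target_fpr)


if __name__ == "__main__":
    # With a prior of 0.1, about a third of 'significant' results are false.
    print(f"{false_positive_risk(0.1):.2f}")  # prints 0.36
    # Prior needed for a 5% false positive risk at alpha = 0.05, power = 0.8.
    print(f"{prior_for_fpr(0.05):.2f}")       # prints 0.54
```

Under these assumed inputs, a false positive risk of 5% would require believing the effect was more likely than not to be real before the experiment was run.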
The reproducibility of research and the misinterpretation of p-values
  • D. Colquhoun
  • Mathematics, Medicine
    Royal Society Open Science
  • 2017
TLDR
It is recommended that the terms ‘significant’ and ‘non-significant’ should never be used and p-values should be supplemented by specifying the prior probability that would be needed to produce a specified false positive risk.
Validating new discoveries in sports medicine: we need FAIR play beyond p values
TLDR
Presents FAIR, a four-item approach to help validate new discoveries in sports medicine, and shows how a study’s FPR can be high even when the corresponding p values are low.
P-values – a chronic conundrum
  • Jian Gao
  • Medicine, Psychology
    BMC Medical Research Methodology
  • 2020
TLDR
This paper is intended to elucidate the p-value confusion from its root, to explicate the difference between significance and hypothesis testing, to illuminate the consequences of the confusion, and to present a viable alternative to the conventional p-value.
How feasible is it to abandon statistical significance? A reflection based on a short survey
Background There is a growing trend in using the “statistically significant” term in the scientific literature. However, harsh criticism of this concept motivated the recommendation to withdraw its
Statistically significant? Let us recognize that estimates of tested effects are uncertain
Haven’t all of us dreamt of concluding that our results be statistically significant, that is, characterized by a p-value lying below an arbitrary threshold, most often [Formula: see text]? In this
Reverse-Bayes methods: a review of recent technical advances
It is now widely accepted that the standard inferential toolkit used by the scientific research community – null-hypothesis significance testing (NHST) – is not fit for purpose. Yet despite the
Reverse-Bayes methods for evidence assessment and research synthesis.
TLDR
It is argued that Reverse-Bayes methods have a key role to play in making Bayesian methods more accessible and attractive for evidence assessment and research synthesis.
Null Hypothesis Significance Testing Defended and Calibrated by Bayesian Model Checking
Abstract Significance testing is often criticized because p-values can be low even though posterior probabilities of the null hypothesis are not low according to some Bayesian models. Those models,
Moving Towards the Post p < 0.05 Era via the Analysis of Credibility
ABSTRACT It is now widely accepted that the techniques of null hypothesis significance testing (NHST) are routinely misused and misinterpreted by researchers seeking insight from data. There is,
The new normal? Redaction bias in biomedical science
TLDR
It is demonstrated that the removal of a surprisingly small number of data points can be used to dramatically alter a result, and the impact of inappropriate redaction beyond a threshold value in biomedical science is formally quantified.

References

SHOWING 1-10 OF 58 REFERENCES
The reproducibility of research and the misinterpretation of p-values
  • D. Colquhoun
  • Mathematics, Medicine
    Royal Society Open Science
  • 2017
TLDR
It is recommended that the terms ‘significant’ and ‘non-significant’ should never be used and p-values should be supplemented by specifying the prior probability that would be needed to produce a specified false positive risk.
An investigation of the false discovery rate and the misinterpretation of p-values
  • D. Colquhoun
  • Computer Science, Medicine
    Royal Society Open Science
  • 2014
TLDR
It is concluded that if you wish to keep your false discovery rate below 5%, you need to use a three-sigma rule, or to insist on p≤0.001, and never use the word ‘significant’.
Assessing the Probability That a Positive Report is False: An Approach for Molecular Epidemiology Studies
TLDR
This commentary shows how to assess the FPRP and how to use it to decide whether a finding is deserving of attention or "noteworthy" and shows how this approach can lead to improvements in the design, analysis, and interpretation of molecular epidemiology studies.
p values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate.
  • S. Goodman
  • Psychology, Medicine
    American journal of epidemiology
  • 1993
TLDR
An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis.
Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy
  • S. Goodman
  • Medicine
    Annals of Internal Medicine
  • 1999
TLDR
The historical and logical foundations of the dominant school of medical statistics, sometimes referred to as frequentist statistics, are explored and the logical fallacy at the heart of this system is explicated, which maintains such a tenacious hold on the minds of investigators, policymakers, and journal editors.
Toward Evidence-Based Medical Statistics. 2: The Bayes Factor
TLDR
The Bayes factor is explored, as nonmathematically as possible, as the Bayesian approach to measuring evidence and combining information, along with the epistemologic uncertainties that affect all statistical approaches to inference.
Reverse-Bayes analysis of two common misinterpretations of significance tests
  • L. Held
  • Mathematics, Medicine
    Clinical trials
  • 2013
TLDR
Two common mistakes in the interpretation of statistical significance tests imply strong and often unrealistic assumptions about the prior proportion or probability of truly effective treatments.
Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence
Abstract The problem of testing a point null hypothesis (or a “small interval” null hypothesis) is considered. Of interest is the relationship between the P value (or observed significance level) and
Why should clinicians care about Bayesian methods
Abstract There is a growing awareness of Bayesian methods within the medical research community, and increasing discussion of their potential applications. This interest has, however, so far failed