P-Curve: A Key to the File Drawer
@article{Simonsohn2014PCurveAK, title={P-Curve: A Key to the File Drawer}, author={Uri Simonsohn and Leif D. Nelson and Joseph P. Simmons}, journal={Journal of Experimental Psychology: General}, year={2014} }
Because scientists tend to report only studies (publication bias) or analyses (p-hacking) that "work," readers must ask, "Are these effects true, or do they merely reflect selective reporting?" We introduce p-curve as a way to answer this question. P-curve is the distribution of statistically significant p values for a set of studies (ps < .05). Because only true effects are expected to generate right-skewed p-curves, containing more low (.01s) than high (.04s) significant p values, only right…
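The abstract's core claim, that only true effects produce right-skewed p-curves, is easy to see in a quick simulation. The sketch below is not from the paper; the two-sample t-test design, the effect sizes (d = 0 vs. d = 0.5), the sample size, and all names are illustrative assumptions.

```python
# Simulate many two-sample t-tests and bin the significant p values:
# under a null effect they are uniform on (0, .05); under a true effect
# they pile up near zero (the right skew that p-curve looks for).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def significant_pvalues(effect_size, n_per_group=20, n_studies=20000):
    """Keep only the p < .05 results from n_studies simulated t-tests."""
    a = rng.normal(effect_size, 1, size=(n_studies, n_per_group))
    b = rng.normal(0, 1, size=(n_studies, n_per_group))
    p = stats.ttest_ind(a, b, axis=1).pvalue
    return p[p < 0.05]

bins = np.linspace(0, 0.05, 6)  # .00-.01, .01-.02, ..., .04-.05
for d in (0.0, 0.5):
    sig = significant_pvalues(d)
    counts, _ = np.histogram(sig, bins=bins)
    shares = ", ".join(f"{lo:.2f}-{hi:.2f}: {c / sig.size:.0%}"
                       for lo, hi, c in zip(bins[:-1], bins[1:], counts))
    print(f"d = {d}: {shares}")
```

Under d = 0 the five bins come out near-uniform (roughly 20% each); under d = 0.5 the lowest bin dominates, which is the right skew the paper treats as diagnostic of evidential value.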
1,022 Citations
Detecting Evidential Value and p-Hacking With the p-Curve Tool: A Word of Caution
- Psychology, Zeitschrift für Psychologie
- 2019
It is shown that not only selective reporting but also selective nonreporting of significant results due to a significant outcome of a more popular alternative test of the same hypothesis may produce left-skewed p-curves, even if all studies reflect true effects.
p-Curve and p-Hacking in Observational Research
- Computer Science, PLoS ONE
- 2016
The p-curve for observational research in the presence of p-hacking is analyzed, and it is shown that even with minimal omitted-variable bias (e.g., unaccounted confounding), p-curves based on true effects and p-curves based on null effects with p-hacking cannot be reliably distinguished.
p-Curve and Effect Size
- Computer Science, Perspectives on Psychological Science
- 2014
Journals tend to publish only statistically significant evidence, creating a scientific record that markedly overstates the size of effects. We provide a new tool that corrects for this bias without…
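For readers curious how effect sizes can be recovered from significant results alone, below is a simplified sketch of the conditional-probability idea: under the true effect size, each significant result's probability of being at least as extreme, conditional on being significant, is uniformly distributed, so one can search for the effect size that makes those conditional probabilities look most uniform. The equal-n two-sample design, the neglect of the lower rejection tail, and every name here are assumptions of this sketch, not the paper's exact procedure.

```python
# Simplified sketch of effect-size estimation from significant results only.
import numpy as np
from scipy import optimize, stats

def pp_values(t_obs, df, d):
    """P(result at least as extreme | significant) under candidate effect d."""
    n = df / 2 + 1                          # per-group n implied by df = 2n - 2
    ncp = d * np.sqrt(n / 2)                # noncentrality of the two-sample t
    t_crit = stats.t.ppf(0.975, df)         # two-tailed .05 cutoff
    power = stats.nct.sf(t_crit, df, ncp)   # chance of landing in the upper tail
    return stats.nct.sf(t_obs, df, ncp) / power

def estimate_d(t_obs, df):
    """Search for the d whose conditional probabilities look most uniform."""
    loss = lambda d: stats.kstest(pp_values(t_obs, df, d), "uniform").statistic
    return optimize.minimize_scalar(loss, bounds=(0.0, 2.0), method="bounded").x

# Four significant t values from studies with n = 20 per group (df = 38):
print(estimate_d(np.array([2.10, 2.40, 2.80, 3.10]), df=38))
```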
Problems in using p-curve analysis and text-mining to detect rate of p-hacking and evidential value
- Computer Science, PeerJ
- 2016
It is concluded that it is not feasible to use the p-curve to estimate the extent of p-hacking and evidential value unless there is considerable control over the type of data entered into the analysis.
Some properties of p-curves, with an application to gradual publication bias.
- Computer Science, Psychological Methods
- 2018
The results of 2 survey experiments support the existence of a cliff effect at p = .05 and suggest that researchers tend to be more likely to recommend submission of an article as the level of statistical significance increases beyond this p level.
The Extent and Consequences of P-Hacking in Science
- Computer Science, PLoS Biology
- 2015
It is suggested that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses, and its effect seems to be weak relative to the real effect sizes being measured.
Z-Curve 2.0: Estimating Replication Rates and Discovery Rates
- Computer Science
- 2020
Publication bias, the fact that published studies are not necessarily representative of all conducted studies, poses a significant threat to the credibility of scientific literature. To mitigate the…
Better P-curves: Making P-curve analysis more robust to errors, fraud, and ambitious P-hacking, a Reply to Ulrich and Miller (2015).
- Computer Science, Journal of Experimental Psychology: General
- 2015
This work considers the possibility that researchers report only the smallest significant p value, examines the impact of more common problems, including p-curvers selecting the wrong p values, fake data, honest errors, and ambitiously p-hacked results, and provides practical solutions that substantially increase p-curve's robustness.
References
Showing 1-10 of 73 references
The file drawer problem and tolerance for null results
- Psychology
- 1979
Quantitative procedures for computing the tolerance for filed and future null results are reported and illustrated, and the implications are discussed.
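As context for this entry, here is a hedged reconstruction of the standard fail-safe-N calculation (my paraphrase, not code from the paper): count how many filed-away null results, averaging z = 0, it would take to pull a Stouffer-combined test back above p = .05 one-tailed.

```python
# Fail-safe N: filed nulls needed to push a combined test above p = .05.
import numpy as np
from scipy import stats

def failsafe_n(p_values):
    z = stats.norm.isf(np.asarray(p_values))  # one-tailed p -> z score
    z_crit = stats.norm.isf(0.05)             # 1.645
    return z.sum() ** 2 / z_crit ** 2 - len(p_values)

print(failsafe_n([0.01, 0.02, 0.03, 0.04]))   # about 19.7 filed nulls
```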
Inappropriate Fiddling with Statistical Analyses to Obtain a Desirable P-value: Tests to Detect its Presence in Published Literature
- Psychology, PLoS ONE
- 2012
This article presents a method for detecting the presence of manipulation of statistical analyses to push a “near significant p-value” to a level that is considered significant in a distribution of p-values from independent studies.
Publication bias in situ
- Medicine, BMC Medical Research Methodology
- 2004
Examples are presented that show how easily PBIS can have a large impact on reported results, as well as how there can be no simple answer to it.
Publication decisions revisited: the effect of the outcome of statistical tests on the decision to publish and vice versa
- Economics
- 1995
Evidence is presented that published results of scientific investigations are not a representative sample of the results of all scientific studies, and that practices leading to publication bias have not changed over a period of 30 years.
A fail-safe N for effect size in meta-analysis.
- Environmental Science
- 1983
Rosenthal's (1979) concept of fail-safe N has thus far been applied to probability levels exclusively. This note introduces a fail-safe N for effect size. Rosenthal's (1979) fail-safe N was an…
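For comparison with the probability-level version above, a minimal sketch of the effect-size analogue this note introduces, assuming the filed studies average zero effect; the function name and example numbers are illustrative.

```python
# Zero-effect studies needed to dilute k studies' mean d to d_criterion.
def orwin_failsafe_n(k, d_mean, d_criterion):
    return k * (d_mean - d_criterion) / d_criterion

# Ten studies averaging d = 0.6 are diluted to d = 0.2 by 20 filed nulls:
print(orwin_failsafe_n(10, 0.6, 0.2))  # 20.0
```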
Tests of Significance for 2 × 2 Contingency Tables
- Mathematics
- 1984
A peculiar prevalence of p values just below .05
- Psychology, Quarterly Journal of Experimental Psychology
- 2012
In null hypothesis significance testing (NHST), p values are judged relative to an arbitrary threshold for significance (.05). The present work examined whether that standard influences the…
Replication and p Intervals: p Values Predict the Future Only Vaguely, but Confidence Intervals Do Much Better
- Psychology, Perspectives on Psychological Science
- 2008
p is so unreliable and gives such dramatically vague information that it is a poor basis for inference; researchers should minimize the role of p by using confidence intervals and model-fitting techniques and by adopting meta-analytic thinking.
A Primer on the Understanding, Use, and Calculation of Confidence Intervals that are Based on Central and Noncentral Distributions
- Psychology
- 2001
Reform of statistical practice in the social and behavioral sciences requires wider use of confidence intervals (CIs), effect size measures, and meta-analysis. The authors discuss four reasons for…
Why Most Published Research Findings Are False
- Business, PLoS Medicine
- 2005
Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true.