Detecting and avoiding likely false‐positive findings – a practical guide
@article{Forstmeier2017DetectingAA,
  title   = {Detecting and avoiding likely false-positive findings -- a practical guide},
  author  = {Wolfgang Forstmeier and Eric-Jan Wagenmakers and Timothy H. Parker},
  journal = {Biological Reviews},
  year    = {2017},
  volume  = {92}
}
Recently there has been a growing concern that many published research findings do not hold up in attempts to replicate them. We argue that this problem may originate from a culture of ‘you can publish if you found a significant effect’. This culture creates a systematic bias against the null hypothesis which renders meta‐analyses questionable and may even lead to a situation where hypotheses become difficult to falsify. In order to pinpoint the sources of error and possible solutions, we…
245 Citations
Perturbations on the uniform distribution of p-values can lead to misleading inferences from null-hypothesis testing
- Environmental Science · Trends in Neuroscience and Education
- 2017
The statistical significance filter leads to overoptimistic expectations of replicability
- Economics · Journal of Memory and Language
- 2018
Evidence that nonsignificant results are sometimes preferred: Reverse P-hacking or selective reporting?
- Psychology · PLoS Biology
- 2019
If researchers less often report significant findings, and/or reverse P-hack to avoid significant outcomes that would undermine the ethos that experimental and control groups differ only in the actively manipulated variables, then significant results from tests for group differences are expected to be under-represented in the literature.
Do p Values Lose Their Meaning in Exploratory Analyses? It Depends How You Define the Familywise Error Rate
- Psychology
- 2017
Several researchers have recently argued that p values lose their meaning in exploratory analyses due to an unknown inflation of the alpha level (e.g., Nosek & Lakens, 2014; Wagenmakers, 2016). For…
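The alpha-inflation argument referenced here is the standard familywise error-rate calculation; a minimal sketch (not code from any of the cited papers, and assuming the tests are independent) shows how fast it grows:

```python
# Hypothetical illustration: probability of at least one false positive
# when m independent tests are each run at a nominal alpha = 0.05,
# with every null hypothesis true.
alpha = 0.05
for m in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m  # familywise error rate
    print(f"{m:2d} tests: FWER = {fwer:.3f}")
```

With 10 unplanned tests the chance of at least one "significant" null result already exceeds 40%, which is the inflation exploratory analyses must account for.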
Modern statistics, multiple testing and wishful thinking
- Medicine · Occupational and Environmental Medicine
- 2018
An article in this issue by Lenters et al. uses simulation to address some questions which should be well understood in the epidemiology community, but sadly are not.
Multiplicity Eludes Peer Review: The Case of COVID-19 Research
- Mathematics · International Journal of Environmental Research and Public Health
- 2021
An exploratory analysis of the Web of Science database for COVID-19 observational studies concludes that special attention must be paid to the increased chance of false discoveries in observational studies, including non-replicated striking discoveries with potentially large social impact.
Does preregistration improve the credibility of research findings?
- Business
- 2020
Preregistration entails researchers registering their planned research hypotheses, methods, and analyses in a time-stamped document before they undertake their data collection and analyses. This…
In Search of the Significant p. Its Influence on the Credibility of Publications
- Psychology
- 2020
Publishing study results in a peer-reviewed journal represents the ultimate goal of research in any field of science and it is obviously assumed that the results are correct and supported by a…
Handling effect size dependency in meta-analysis
- Economics · International Review of Sport and Exercise Psychology
- 2021
ABSTRACT The statistical synthesis of quantitative effects within primary studies via meta-analysis is an important analytical technique in the scientific toolkit of modern researchers. As with any…
Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis
- Business · CHI
- 2020
This study pores over nine published research studies and conducts semi-structured interviews with their authors, confirming that researchers may experiment with analytic choices in search of desirable results, and identifying other reasons why researchers explore alternatives yet omit findings.
References
Showing 1–10 of 158 references
The natural selection of bad science
- Biology · Royal Society Open Science
- 2016
A 60-year meta-analysis of statistical power in the behavioural sciences is presented and it is shown that power has not improved despite repeated demonstrations of the necessity of increasing power, and that replication slows but does not stop the process of methodological deterioration.
Significance chasing in research practice: causes, consequences and possible solutions.
- Medicine · Addiction
- 2015
Significance chasing, questionable research practices and poor study reproducibility are the unfortunate consequence of a 'publish or perish' culture and a preference among journals for novel findings.
Publication Bias: The "File-Drawer" Problem in Scientific Inference
- Economics
- 1999
Publication bias arises whenever the probability that a study is published depends on the statistical significance of its results. This bias, often called the file-drawer effect since the unpublished…
Why Most Published Research Findings Are False
- Business · PLoS Medicine
- 2005
Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true.
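The core arithmetic behind that claim can be sketched with the positive predictive value formula PPV = (power · R) / (power · R + alpha), where R is the pre-study odds that a probed relationship is real. This is a simplified version (ignoring bias and multiple teams) with assumed example values, not the paper's full simulation:

```python
def ppv(prior_odds, power=0.8, alpha=0.05):
    """P(claim is true | significant result), ignoring bias.

    prior_odds -- R, the pre-study odds that a tested relationship is real.
    """
    return (power * prior_odds) / (power * prior_odds + alpha)

# Long-shot hypotheses (R = 1:10) probed with low power (0.2):
# fewer than a third of significant findings are true.
print(f"PPV = {ppv(0.1, power=0.2):.2f}")
```

Whenever power · R falls below alpha, a significant result is more likely false than true, which is the condition Ioannidis argues holds for most study designs.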
Consequences of Prejudice Against the Null Hypothesis
- Psychology
- 1975
The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to stimulate the research-publication process and (b) case studies of…
False-Positive Psychology
- Psychology · Psychological Science
- 2011
It is shown that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings, flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates, and a simple, low-cost, and straightforwardly effective disclosure-based solution is suggested.
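One of the degrees of freedom Simmons and colleagues describe, peeking at the data and collecting more if the first test misses significance, can be reproduced in a small Monte Carlo sketch. This is a hypothetical illustration (a z-test on simulated standard-normal data, not the paper's own design):

```python
import math
import random

def z_p(a, b):
    # Two-sample z-test assuming known sd = 1 (data are standard normal).
    z = (sum(a) / len(a) - sum(b) / len(b)) / math.sqrt(1 / len(a) + 1 / len(b))
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

random.seed(1)
trials, hits = 4000, 0
for _ in range(trials):
    # Both groups drawn from the same distribution: every "hit" is false.
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 1) for _ in range(20)]
    sig = z_p(a, b) < 0.05
    if not sig:  # peek, then add 10 more per group and test again
        a += [random.gauss(0, 1) for _ in range(10)]
        b += [random.gauss(0, 1) for _ in range(10)]
        sig = z_p(a, b) < 0.05
    hits += sig
print(f"false-positive rate with one peek: {hits / trials:.3f}")
```

Even this single extra look pushes the realized error rate noticeably above the nominal 0.05; stacking several such choices produces the dramatic inflation the paper reports.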
An Agenda for Purely Confirmatory Research
- Psychology · Perspectives on Psychological Science
- 2012
This article proposes that researchers preregister their studies and indicate in advance the analyses they intend to conduct, and proposes that only these analyses deserve the label “confirmatory,” and only for these analyses are the common statistical tests valid.
Estimating the reproducibility of psychological science
- Psychology · Science
- 2015
A large-scale assessment suggests that experimental reproducibility in psychology leaves a lot to be desired, and correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse
- Psychology · Behavioral Ecology and Sociobiology
- 2010
Full-model tests and P-value adjustments can be used as a guide to how frequently type I errors arise from sampling variation alone; the authors favour the presentation of full models, since these best reflect the range of predictors investigated and ensure a balanced representation of non-significant results as well.
p-Curve and p-Hacking in Observational Research
- Computer Science · PLoS ONE
- 2016
The p-curve for observational research in the presence of p-hacking is analyzed, and it is shown that even with minimal omitted-variable bias (e.g., unaccounted confounding), p-curves based on true effects and p-curves based on null effects with p-hacking cannot be reliably distinguished.