Estimating the Reproducibility of Experimental Philosophy

@article{Cova2018EstimatingTR,
  title={Estimating the Reproducibility of Experimental Philosophy},
  author={Florian Cova and Brent Strickland and Angela Gaia Felicita Abatista and Aur{\'e}lien Allard and James Andow and Mario Attie and James R. Beebe and Renatas Berniūnas and Jordane Boudesseul and Matteo Colombo and Fiery Andrews Cushman and Rodrigo D{\'i}az and Noah N’Djaye Nikolai van Dongen and Vilius Dranseika and Brian D. Earp and Antonio Gait{\'a}n Torres and Ivar Rodr{\'i}guez Hannikainen and Jos{\'e} V. Hern{\'a}ndez-Conde and Wenjia Hu and François Jaquet and Kareem Khalifa and Hannah Kim and Markus Kneer and Joshua Knobe and Miklos Kurthy and Anthony Lantian and Shen-Yi Liao and Edouard Machery and Tania Moerenhout and Christian Mott and Mark Phelan and Jonathan Scott Phillips and Navin Rambharose and Kevin Reuter and Felipe Romero and Paulo Sousa and Jan Sprenger and Emile Thalabard and Kevin Patrick Tobia and Hugo Viciana and Daniel A. Wilkenfeld and Xiang Zhou},
  journal={Review of Philosophy and Psychology},
  year={2018},
  volume={12},
  pages={9-44}
}
Responding to recent concerns about the reliability of the published literature in psychology and other disciplines, we formed the X-Phi Replicability Project (XRP) to estimate the reproducibility of experimental philosophy (osf.io/dvkpr). Drawing on a representative sample of 40 x-phi studies published between 2003 and 2015, we enlisted 20 research teams across 8 countries to conduct a high-quality replication of each study in order to compare the results to the original published findings…
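Comparisons of replication results to original findings are scored against explicit criteria. The sketch below is a minimal illustration of one common criterion, a statistically significant replication effect in the same direction as the original; the numbers and layout are made up and do not represent the XRP data or its full set of criteria.

```python
# Hypothetical sketch: scoring replications under one common criterion,
# a significant replication effect in the same direction as the original.
# Data and field layout are illustrative, not the XRP dataset.

def replicated(original_effect: float, replication_effect: float,
               replication_p: float, alpha: float = 0.05) -> bool:
    """Count a study as replicated if the replication is significant at
    `alpha` and its effect points in the same direction as the original."""
    same_direction = original_effect * replication_effect > 0
    return replication_p < alpha and same_direction

# (original effect size, replication effect size, replication p-value)
studies = [
    (0.45, 0.38, 0.01),
    (0.30, 0.05, 0.40),
    (-0.25, -0.22, 0.03),
]

rate = sum(replicated(o, r, p) for o, r, p in studies) / len(studies)
print(f"Estimated replication rate: {rate:.0%}")
```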
Reporting in Experimental Philosophy: Current Standards and Recommendations for Future Practice
TLDR
A detailed, comprehensive assessment of current reporting practices in Experimental Philosophy focuses on the quality of statistical reporting and the disclosure of information about study methodology, and makes recommendations for authors, reviewers and editors to facilitate making research statistically transparent and reproducible.
Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability
TLDR
Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes were 78% smaller, on average, than the original effect sizes.
Statistical reporting inconsistencies in experimental philosophy
TLDR
From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science: rates of reporting inconsistencies in experimental philosophy were slightly lower than those found in the psychological and behavioral sciences.
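Checks of this kind typically recompute a p-value from the reported test statistic and degrees of freedom and flag mismatches with the reported value. The sketch below is a minimal illustration of that idea, not the authors' code, using a reported t-test as an example; real checkers handle rounding of reported p-values more carefully than the simple tolerance used here.

```python
# Minimal illustration (not the authors' code) of a statistical reporting
# consistency check: recompute the p-value implied by a reported t statistic
# and degrees of freedom, and flag it if it disagrees with the reported p.
from scipy import stats

def check_t_report(t: float, df: int, reported_p: float, tol: float = 0.005) -> bool:
    """Return True if the reported two-sided p-value matches the t and df,
    up to a simple tolerance (real tools model rounding more precisely)."""
    recomputed = 2 * stats.t.sf(abs(t), df)
    return abs(recomputed - reported_p) <= tol

# e.g. "t(38) = 2.10, p = .04": recomputed p is about .042, so this passes
print(check_t_report(t=2.10, df=38, reported_p=0.04))
```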
New statistical metrics for multisite replication projects
Increasingly, researchers are attempting to replicate published original studies by using large, multisite replication projects, at least 134 of which have been completed or are ongoing. These
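As background for what such metrics build on, here is a minimal sketch, my illustration rather than the metrics proposed in that paper, of how multisite results are commonly summarized: an inverse-variance (fixed-effect) pooled estimate plus Cochran's Q for between-site heterogeneity. All numbers are made up.

```python
# Minimal sketch (not the paper's metrics): fixed-effect pooling of
# site-level effect estimates and Cochran's Q for heterogeneity.
import math

def pool_sites(effects, variances):
    """Fixed-effect pooled estimate, its standard error, and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se, q

effects = [0.21, 0.35, 0.10, 0.28]      # illustrative site-level estimates
variances = [0.02, 0.03, 0.02, 0.04]    # their sampling variances
pooled, se, q = pool_sites(effects, variances)
print(f"pooled = {pooled:.2f} (SE {se:.2f}), Cochran's Q = {q:.2f}")
```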
The Replicability Crisis and Public Trust in Psychological Science
Replication failures of past findings in several scientific disciplines, including psychology, medicine, and experimental economics, have created a ‘crisis of confidence’ among scientists.
Predictive Evaluation of Replication Studies
Throughout the last decade, the so-called replication crisis has stimulated many researchers to conduct large-scale replication projects. With data from four of these projects, we computed
The Meta-Science of Adult Statistical Word Segmentation: Part 1
We report the first set of results in a multi-year project to assess the robustness – and the factors promoting robustness – of the adult statistical word segmentation literature. This includes eight
Developmental psychologists should adopt citizen science to improve generalization and reproducibility
Widespread failures of replication and generalization are, ironically, a scientific triumph, in that they confirm the fundamental metascientific theory that underlies our field. Generalizable and
Predicting replication outcomes in the Many Labs 2 study
How best to quantify replication success? A simulation study on the comparison of replication success metrics
TLDR
Generally, meta-analytic approaches seem to slightly outperform metrics that evaluate single studies, except in the scenario of extreme publication bias, where this pattern reverses.
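To make that contrast concrete, the sketch below computes one metric of each kind: the p-value of the replication study on its own versus the p-value of a fixed-effect combination of original and replication. The effect sizes and standard errors are made up and this is not the paper's simulation.

```python
# Illustrative contrast between (a) a single-study metric and (b) a
# meta-analytic metric of replication success. Numbers are made up.
import math
from statistics import NormalDist

def two_sided_p(effect, se):
    """Two-sided p-value from a normal (z) approximation."""
    return 2 * (1 - NormalDist().cdf(abs(effect / se)))

def fixed_effect(e1, se1, e2, se2):
    """Inverse-variance weighted combination of two estimates."""
    w1, w2 = 1 / se1 ** 2, 1 / se2 ** 2
    return (w1 * e1 + w2 * e2) / (w1 + w2), math.sqrt(1 / (w1 + w2))

orig_effect, orig_se = 0.40, 0.15   # original study
rep_effect, rep_se = 0.18, 0.10     # replication study

p_single = two_sided_p(rep_effect, rep_se)                        # (a)
pooled, pooled_se = fixed_effect(orig_effect, orig_se, rep_effect, rep_se)
p_meta = two_sided_p(pooled, pooled_se)                           # (b)
print(f"single-study p = {p_single:.3f}, meta-analytic p = {p_meta:.3f}")
```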
...

References

SHOWING 1-10 OF 101 REFERENCES
Estimating the reproducibility of psychological science
TLDR
A large-scale assessment suggests that experimental reproducibility in psychology leaves a lot to be desired, and correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
Statistical reporting inconsistencies in experimental philosophy
TLDR
From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science: rates of reporting inconsistencies in experimental philosophy were slightly lower than those found in the psychological and behavioral sciences.
Experimental Philosophy: A Methodological Critique
This article offers a critique of research practices typical of experimental philosophy. To that end, it presents a review of methodological issues that have proved crucial to the quality of research
The Alleged Crisis and the Illusion of Exact Replication
  • W. Stroebe, F. Strack
  • Psychology
  • Perspectives on Psychological Science
  • 2014
TLDR
It is proposed that for meaningful replications, attempts at reinstating the original circumstances are not sufficient and replicators must ascertain that conditions are realized that reflect the theoretical variable(s) manipulated (and/or measured) in the original study.
Facts Are More Important Than Novelty
Despite increased attention to methodological rigor in education research, the field has focused heavily on experimental design and not on the merit of replicating important results. The present
Comment on “Estimating the reproducibility of psychological science”
TLDR
It is shown that this article contains three statistical errors and provides no support for the conclusion that the reproducibility of psychological science is surprisingly low, and that the data are consistent with the opposite conclusion.
Replication, falsification, and the crisis of confidence in social psychology
TLDR
This paper considers the replication debate from a historical and philosophical perspective, and provides a conceptual analysis of both replication and falsification as they pertain to this important discussion.
False-Positive Psychology
TLDR
It is shown that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings, flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates, and a simple, low-cost, and straightforwardly effective disclosure-based solution is suggested.
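One such flexibility, optional stopping, is easy to demonstrate by simulation. The sketch below is my illustration, not the paper's materials: it repeatedly peeks at null data and stops at the first "significant" result, which pushes the false-positive rate well above the nominal 5%. It uses a z-test approximation for simplicity.

```python
# Rough simulation (not the paper's code) of optional stopping: peek at the
# data every few subjects and stop at the first p < .05. With no true effect,
# the false-positive rate ends up well above the nominal 5%.
import math
import random
from statistics import NormalDist, mean, variance

def significant_with_peeking(n_start=20, step=10, n_max=60, alpha=0.05):
    """Two null groups; z-test (approximation) after every `step` added subjects."""
    a, b = [], []
    norm = NormalDist()
    while len(a) < n_max:
        add = n_start if not a else step
        a += [random.gauss(0, 1) for _ in range(add)]
        b += [random.gauss(0, 1) for _ in range(add)]
        n = len(a)
        se = math.sqrt(variance(a) / n + variance(b) / n)
        p = 2 * (1 - norm.cdf(abs((mean(a) - mean(b)) / se)))
        if p < alpha:      # stop (and "publish") at the first significant peek
            return True
    return False

rate = mean(significant_with_peeking() for _ in range(2000))
print(f"False-positive rate with optional stopping: {rate:.1%}")  # well above 5%
```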
Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs
TLDR
A practical primer on how to calculate and report effect sizes for t-tests and ANOVAs such that effect sizes can be used in a priori power analyses and meta-analyses, and a detailed overview of the similarities and differences between within- and between-subjects designs is provided.
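As a flavor of the conversions such primers cover, the sketch below computes Cohen's d from an independent-samples t statistic and partial eta squared from an ANOVA F statistic. The formulas are standard; the numbers are illustrative and not taken from the primer.

```python
# Standard effect-size conversions (illustrative sketch, my own example values).
import math

def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d for an independent-samples t-test."""
    return t * math.sqrt(1 / n1 + 1 / n2)

def partial_eta_sq(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an ANOVA F statistic."""
    return (f * df_effect) / (f * df_effect + df_error)

print(cohens_d_from_t(t=2.5, n1=40, n2=40))              # about 0.56
print(partial_eta_sq(f=4.2, df_effect=1, df_error=78))   # about 0.05
```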
...