Many Labs 2: Investigating Variation in Replicability Across Samples and Settings

@article{Klein2018ManyL2,
  title={Many Labs 2: Investigating Variation in Replicability Across Samples and Settings},
  author={R. A. Klein and M. Vianello and F. Hasselman and B. Adams and R. Adams and Sinan Alper and M. Aveyard and Jordan R Axt and Mayowa T. Babalola and {\vS}těp{\'a}n Bahn{\'i}k and R. Batra and M. Berkics and M. Bernstein and D. R. Berry and Olga Bialobrzeska and Evans Dami Binan and K. Bocian and M. Brandt and Robert Busching and Anna Redei and Huajian Cai and F. Cambier and K. Cantarero and Cheryl L Carmichael and F. C{\'e}ric and Jesse J Chandler and Jen-Ho Chang and A. Chatard and E. Chen and Winnee Cheong and D. Cicero and S. Coen and Jennifer A Coleman and Brian Collisson and Morgan Conway and Katherine S. Corker and P. Curran and F. Cushman and Z. Dagona and Ilker Dalgar and A. D. Rosa and W. E. Davis and M. D. Bruijn and Leander De Schutter and T. Devos and M. D. Vries and Canay Doğulu and Nerisa Dozo and K. Dukes and Yarrow Dunham and K. Durrheim and C. Ebersole and J. Edlund and Anja Eller and Alexander S English and C. Finck and Natalia Frankowska and M. Freyre and Mike Friedman and E. Galliani and Joshua Chiroma Gandi and Tanuka Ghoshal and S. Giessner and Tripat Gill and Timo Gnambs and {\'A}ngel G{\'o}mez and R. Gonz{\'a}lez and J. Graham and Jon E. Grahe and Ivan Grahek and Eva Green and Kakul Hai and M. Haigh and Elizabeth L. Haines and Michael P. Hall and Marie E. Heffernan and Joshua A. Hicks and P. Houdek and Jeffrey R. Huntsinger and H. Huynh and H. Ijzerman and Y. Inbar and {\AA}. Innes-Ker and William Jimenez-Leal and Melissa-Sue John and Jennifer A. Joy-Gaba and Roza G. Kamiloğlu and Heather B Kappes and Serdar Karabati and H. Karick and Victor N. Keller and Anna Kende and Nicolas Kervyn and G. Kne{\vz}evi{\'c} and C. Kovacs and Lacy E Krueger and G. Kurapov and J. Kurtz and D. Lakens and Ljiljana B. Lazarevi{\'c} and C. Levitan and Neil Lewis and Samuel Lins and Nikolette Lipsey and Joy E Losee and E. Maassen and Angela T Maitner and W. Malingumu and Robyn K Mallett and Satia A. Marotta and Janko Međedovi{\'c} and Fernando Mena-Pacheco and T. Milfont and Wendy L. Morris and Sean C Murphy and A. Myachykov and N. Neave and K. Neijenhuijs and A. J. Nelson and F{\'e}lix Neto and Austin Lee Nichols and A. Ocampo and S. O'Donnell and Haruka Oikawa and M. Oikawa and E. Ong and G{\'a}bor Orosz and Malgorzata Osowiecka and Grant Packard and Rolando P{\'e}rez-S{\'a}nchez and B. Petrovi{\'c} and Ronaldo Pilati and B. Pinter and L. Podesta and Gabrielle Pogge and M. Pollmann and Abraham M. Rutchick and Patricio Saavedra and Alexander K Saeri and E. Salomon and Kathleen Schmidt and Felix D. Sch{\"o}nbrodt and M. Sekerdej and David Sirlop{\'u} and Jeanine L. M. Skorinko and M. A. Smith and V. Smith-Castro and K. Smolders and A. Sobkow and W. Sowden and Philipp Spachtholz and M. Srivastava and Troy G Steiner and J. Stouten and Chris N. H. Street and Oskar K. Sundfelt and S. Szeto and E. Szumowska and Andrew C. W. Tang and Norbert K Tanzer and Morgan J. Tear and Jordan E. Theriault and M. Thomae and David Torres and Jakub Traczyk and Joshua M. Tybur and A. Ujhelyi and R. V. Aert and M. V. Assen and M. Hulst and P. V. Lange and A. V. Veer and Alejandro Echeverr{\'i}a and L. Vaughn and A. V{\'a}zquez and L. D. Vega and Catherine Verniers and M. Verschoor and Ingrid Voermans and M. Vranka and C. A. Welch and A. Wichman and L. Williams and M. Wood and Julie A. Woodzicka and M. Wronska and L. Young and J. Zelenski and Zeng Zhi-jia and Brian A. Nosek},
  journal={Advances in Methods and Practices in Psychological Science},
  year={2018},
  volume={1},
  pages={443--490}
}
We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a …

Citations

Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability
TLDR
Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes were 78% smaller, on average, than the original effect sizes.
Assessing heterogeneity and power in replications of psychological experiments.
In this study, we reanalyze recent empirical research on replication from a meta-analytic perspective. We argue that there are different ways to define "replication failure," and that analyses can …
Heterogeneity in direct replications in psychology and its association with effect size
We examined the evidence for heterogeneity (of effect sizes) when only minor changes to sample population and settings were made between studies and explored the association between heterogeneity and …
The importance of heterogeneity in large-scale replications
In a large-scale replication effort, Klein et al. (2018, https://doi.org/10.1177/2515245918810225) investigate the variation in replicability and effect size across many different samples and …
Statistical methods for replicability assessment
Large-scale replication studies like the Reproducibility Project: Psychology (RP:P) provide invaluable systematic data on scientific replicability, but most analyses and interpretations of the data …
Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015
TLDR
It is found that peer beliefs of replicability are strongly related to replicability, suggesting that the research community could predict which results would replicate and that failures to replicate were not the result of chance alone.
What Meta-Analyses Reveal About the Replicability of Psychological Research
TLDR
The low power and high heterogeneity that the survey finds fully explain recent difficulties in replicating highly regarded psychological studies and reveal challenges for scientific progress in psychology.
Replicator degrees of freedom allow publication of misleading failures to replicate
TLDR
It is shown that “replicator degrees of freedom” make it far too easy to obtain and publish false-negative replication results, even while appearing to adhere to strict methodological standards.

References

Showing 1–10 of 243 references
Investigating variation in replicability: A “Many Labs” replication project
Although replication is a central tenet of science, direct replications are rare in psychology. This research tested variation in the replicability of 13 classic and contemporary effects across 36 …
Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015
TLDR
It is found that peer beliefs of replicability are strongly related to replicability, suggesting that the research community could predict which results would replicate and that failures to replicate were not the result of chance alone.
Contextual sensitivity in scientific reproducibility
TLDR
It is found that the extent to which the research topic was likely to be contextually sensitive was associated with replication success, and this relationship remained a significant predictor of replication success even after adjusting for characteristics of the original and replication studies that previously had been associated with replication success.
Theory building through replication: Response to commentaries on the “Many Labs” replication project
While direct replications such as the “Many Labs” project are extremely valuable in testing the reliability of published findings across laboratories, they reflect the common reliance in psychology …
Evaluating replicability of laboratory experiments in economics
TLDR
To contribute data about replicability in economics, 18 studies published in the American Economic Review and the Quarterly Journal of Economics between 2011 and 2014 are replicated, finding that two-thirds of the 18 studies examined yielded replicable estimates of effect size and direction.
Generalizing from Survey Experiments Conducted on Mechanical Turk: A Replication Approach
To what extent do survey experimental treatment effect estimates generalize to other populations and contexts? Survey experiments conducted on convenience samples have often been criticized on the …
The Generalizability of Survey Experiments
Survey experiments have become a central methodology across the social sciences. Researchers can combine experiments’ causal power with the generalizability of population-based samples. Yet, …
Estimating the reproducibility of psychological science
TLDR
A large-scale assessment suggests that experimental reproducibility in psychology leaves a lot to be desired, and correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
Many Labs 3: Evaluating participant pool quality across the academic semester via replication
The university participant pool is a key resource for behavioral research, and data quality is believed to vary over the course of the academic semester. This crowdsourced project examined …
The Alleged Crisis and the Illusion of Exact Replication
  • W. Stroebe, F. Strack
  • Perspectives on Psychological Science
  • 2014
TLDR
It is proposed that for meaningful replications, attempts at reinstating the original circumstances are not sufficient and replicators must ascertain that conditions are realized that reflect the theoretical variable(s) manipulated (and/or measured) in the original study.