Multiple Trials May Yield Exaggerated Effect Size Estimates

@article{Brand2011MultipleTM,
  title={Multiple Trials May Yield Exaggerated Effect Size Estimates},
  author={Andrew Brand and Michael T. Bradley and Lisa A. Best and George Valentin Stoica},
  journal={The Journal of General Psychology},
  year={2011},
  volume={138},
  pages={1--11}
}
ABSTRACT Published psychological research attempting to support the existence of small and medium effect sizes may not have enough participants to do so accurately, so repeated trials or multiple items may be used in an attempt to obtain significance. Through a series of Monte-Carlo simulations, this article describes the impact of multiple trials or items on effect size estimates when the averages or aggregates of a dependent measure are analyzed. The simulations revealed a…
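The mechanism the abstract describes is easy to reproduce. Below is a minimal Monte-Carlo sketch (my own illustration, not the authors' code; n, k, d_trial, and r are assumed values) of why computing Cohen's d on trial-averaged scores inflates the per-trial effect size: averaging shrinks the trial-specific noise in the denominator while leaving the mean difference intact.

```python
# Minimal sketch of effect size inflation from trial averaging.
# Not the authors' code; n, k, d_trial, and r are illustrative values.
import numpy as np

rng = np.random.default_rng(0)
n, k, d_trial, r = 30, 20, 0.2, 0.5  # subjects/group, trials, per-trial d, inter-trial correlation

def group_scores(mean, n, k, r, rng):
    # Each subject's trials share a stable component (variance r) plus
    # trial noise (variance 1 - r): per-trial variance is 1 and any two
    # trials correlate at r. Return each subject's trial average.
    stable = rng.normal(mean, np.sqrt(r), size=(n, 1))
    noise = rng.normal(0.0, np.sqrt(1.0 - r), size=(n, k))
    return (stable + noise).mean(axis=1)

def cohens_d(a, b):
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

ds = [cohens_d(group_scores(d_trial, n, k, r, rng),
               group_scores(0.0, n, k, r, rng)) for _ in range(5000)]
print(f"per-trial d = {d_trial}, mean d on {k}-trial averages = {np.mean(ds):.2f}")
# The inflation factor is sqrt(k / (1 + (k - 1) * r)), about 1.38 here.
```

With these values the d computed on averages comes out near 0.28 rather than the per-trial 0.20, and the gap widens as the inter-trial correlation falls or the number of trials grows.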
Accuracy of Effect Size Estimates From Published Psychological Experiments Involving Multiple Trials
TLDR
Simulations showed a large increase in observed effect size averages, and the power to accept these estimates as statistically significant increased with the number of trials or items.
Correcting Overestimated Effect Size Estimates in Multiple Trials
TLDR
It is concluded that, in practice, d′c together with plausible estimates of the inter-trial correlation will produce a more precise effect size range than that suggested by Brand and colleagues (2011).
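The correction at work here follows from the variance of a mean of correlated trials. The paper's exact d′c estimator may differ in detail, so treat the sketch below as the standard rescaling that motivates it, with the trial count k and inter-trial correlation r as assumed inputs.

```python
# Sketch of a variance-of-a-mean correction for averaged-trial d values.
# The paper's exact d'c estimator may differ; k and r are assumed inputs.
import numpy as np

def correct_d(d_observed, k, r):
    """Rescale a d computed on k-trial averages back to the per-trial metric.

    Averaging k trials with inter-trial correlation r shrinks the SD by
    sqrt((1 + (k - 1) * r) / k), so the observed d is deflated by the
    same factor.
    """
    return d_observed * np.sqrt((1 + (k - 1) * r) / k)

# The correction is sensitive to the assumed r, which is why reporting a
# range over plausible r values gives a more honest effect size interval:
for r in (0.2, 0.5, 0.8):
    print(f"r = {r}: corrected d = {correct_d(0.50, 20, r):.2f}")
```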
Alpha Values as a Function of Sample Size, Effect Size, and Power: Accuracy over Inference
TLDR
It was evident that sample sizes in most psychological studies are adequate for large effect sizes (defined as .8), but it is doubtful whether these ideal levels of alpha and power have generally been achieved for medium effect sizes in actual research, since roughly 170 participants would be required.
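For context, the sample sizes implied by Cohen's benchmarks are quick to check. The sketch below is my illustration using statsmodels' power solver for an independent-samples t-test; the two-sided alpha of .05 and power of .80 are assumed settings, not values from the paper.

```python
# Required per-group n for Cohen's benchmark effect sizes; illustrative
# settings (two-sample t-test, two-sided alpha = .05, power = .80).
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
for label, d in (("large", 0.8), ("medium", 0.5), ("small", 0.2)):
    n = solver.solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(f"{label} (d = {d}): about {round(n)} participants per group")
# A medium effect needs roughly 64 per group (~128 total); raising power
# toward .90 pushes the required total near the 170 figure cited above.
```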
Effect Size
  • Psychology
In statistics, an effect size is a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity. An effect size…
Interpreting Effect Size Estimates through Graphic Analysis of Raw Data Distributions
TLDR
This paper considers and simulates cases where graphical analyses reveal distortion in effect size estimates, and highlights the value of graphing data when interpreting effect size estimates.
The Precision of Effect Size Estimation From Published Psychological Research
TLDR
Additional exploratory analyses revealed that CI widths varied across psychological research areas and that CI widths were not discernibly decreasing over time; the theoretical implications are discussed along with ways of reducing CI widths and thus improving the precision of effect size estimation.
More Voodoo Correlations: When Average-Based Measures Inflate Correlations
TLDR
A Monte-Carlo simulation was conducted to assess the extent to which a correlation estimate can be inflated when an average-based measure is used in a commonly employed correlational design, and reveals that the inflation of the correlation estimate can be substantial, up to 76%.
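The same averaging logic applies to correlations. Here is a small sketch (my illustration with assumed parameters, not the paper's design) in which averaging trials raises reliability and de-attenuates the observed correlation relative to single-trial scores.

```python
# Sketch of correlation inflation from average-based measures.
# Illustrative parameters, not the paper's design.
import numpy as np

rng = np.random.default_rng(1)
n, k, rho = 2000, 25, 0.4  # subjects, trials per measure, latent correlation

# Latent true scores for two measures, correlated rho across subjects.
true = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

def single_and_average(true_col, k, noise_sd, rng):
    # k noisy trials around each subject's true score.
    trials = true_col[:, None] + rng.normal(0.0, noise_sd, size=(n, k))
    return trials[:, 0], trials.mean(axis=1)  # one trial vs. the k-trial average

x1, x_avg = single_and_average(true[:, 0], k, 1.5, rng)
y1, y_avg = single_and_average(true[:, 1], k, 1.5, rng)

print(f"single-trial r = {np.corrcoef(x1, y1)[0, 1]:.2f}")
print(f"average-based r = {np.corrcoef(x_avg, y_avg)[0, 1]:.2f}")
# Averaging boosts reliability (Spearman-Brown), so the average-based r
# sits far above the single-trial r even though the latent rho is fixed.
```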
What Can Cross-Cultural Correlations Teach Us about Human Nature?
TLDR
This work provides examples of how cross-cultural non-equivalence of measurement gives rise to problems in the context of testing evolutionary hypotheses about human behavior, and offers some suggestions for future research.
Examining Treatment Effects for Single-Case ABAB Designs through Sensitivity Analyses
TLDR
This work performed a series of sensitivity analyses while also exploring ways in which HLM can be used to examine new and different questions when dealing with published single-case data.
Using PPT to account for randomness in perception
TLDR
Most participants in both experiments tended to perform at a less-than-perfect level, even after their scores were corrected, demonstrating that at least one systematic factor influences detection that is not included in signal detection theory.
...

References

SHOWING 1-10 OF 27 REFERENCES
Standardized or simple effect size: what should be reported?
  • T. Baguley
  • Psychology
    British Journal of Psychology
  • 2009
TLDR
Factors that distort standardized effect size unless suitable corrections are employed, such as reliability, range restriction, and differences in design, are explored.
Accuracy of Effect Size Estimates from Published Psychological Research
TLDR
A Monte-Carlo simulation was used to model the biasing of effect sizes in published studies and indicates that, when a predominant bias to publish studies with statistically significant results is coupled with inadequate statistical power, there will be an overestimation of effect sizes.
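That selection mechanism is simple to reproduce. The sketch below is my illustration with assumed values, not the paper's simulation: it "publishes" only significant results from an underpowered design and reports the mean published effect size.

```python
# Sketch of publication bias inflating effect sizes; assumed values,
# not the paper's simulation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_d, n = 0.3, 20  # small true effect, underpowered groups of 20

published = []
for _ in range(20000):
    a = rng.normal(true_d, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    t, p = stats.ttest_ind(a, b)
    if p < 0.05 and t > 0:  # "publish" only significant positive results
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        published.append((a.mean() - b.mean()) / pooled_sd)

print(f"true d = {true_d}, mean published d = {np.mean(published):.2f}")
# With roughly 15% power, only unusually large sample effects clear the
# significance bar, so the published average lands well above 0.3.
```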
How many repeated measures in repeated measures designs? Statistical issues for comparative trials
  • A. Vickers
  • Psychology, Medicine
    BMC Medical Research Methodology
  • 2003
TLDR
The proposed method offers a rational basis for determining the number of repeated measures in repeated-measures designs and is effective in randomized and non-randomized comparative trials.
Do studies of statistical power have an effect on the power of studies?
The long-term impact of studies of statistical power is investigated using J. Cohen's (1962) pioneering work as an example. We argue that the impact is nil; the power of studies in the same journal…
Statistical power of psychological research: what have we gained in 20 years?
  • J. Rossi
  • Psychology
    Journal of Consulting and Clinical Psychology
  • 1990
TLDR
The implications of these results concerning the proliferation of Type I errors in the published literature, the failure of replication studies, and the interpretation of null (negative) results are emphasized.
Some cautions regarding statistical power in split-plot designs
We show that if overall sample size and effect size are held constant, the power of the F test for a one-way analysis of variance decreases dramatically as the number of groups increases. This…
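This is easy to verify numerically. The sketch below (illustrative values of Cohen's f and total N, using statsmodels' ANOVA power calculator) holds total sample size and effect size fixed while the number of groups grows.

```python
# Power of the one-way ANOVA F test at fixed total N and Cohen's f,
# as the number of groups grows; f and N are illustrative values.
from statsmodels.stats.power import FTestAnovaPower

solver = FTestAnovaPower()
for k in (2, 4, 6, 8):
    power = solver.power(effect_size=0.25, nobs=120, alpha=0.05, k_groups=k)
    print(f"{k} groups, N = 120, f = 0.25: power = {power:.2f}")
# The noncentrality f**2 * N stays fixed while the numerator df grows,
# so power falls steadily with the number of groups.
```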
Why P Values Are Not a Useful Measure of Evidence in Statistical Significance Testing
Reporting p values from statistical significance tests is common in psychology's empirical literature. Sir Ronald Fisher saw the p value as playing a useful role in knowledge development by acting as…
How many repeated measurements are useful?
  • J. Overall
  • Mathematics
    Journal of Clinical Psychology
  • 1996
TLDR
Monte Carlo results confirmed that increasing the number of repeated measurements across a fixed treatment period generally had negative or neutral implications for the power of significance tests in the presence of serial dependencies that produced heterogeneous correlations among the repeated measurements.
Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design.
TLDR
The use of simple summary statistics for analysing repeated measurements in randomized clinical trials with two treatments supports the value of the compound symmetry assumption as a realistic simplification in quantitative planning of repeated measures trials.
Statistical Methods in Psychology Journals: Guidelines and Explanations
In the light of continuing debate over the applications of significance testing in psychology journals and following the publication of Cohen's (1994) article, the Board of Scientific Affairs (BSA)…
...