The significance filter, the winner's curse and the need to shrink

  title={The significance filter, the winner's curse and the need to shrink},
  author={Erik W. van Zwet and Eric A. Cator},
  journal={Statistica Neerlandica},
  • Published 20 September 2020
  • Mathematics
The "significance filter" refers to focusing exclusively on statistically significant results. Since frequentist properties such as unbiasedness and coverage are valid only before the data have been observed, there are no guarantees if we condition on significance. In fact, the significance filter leads to overestimation of the magnitude of the parameter, which has been called the "winner's curse". It can also lead to undercoverage of the confidence interval. Moreover, these problems become… 
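The filtering effect described in the abstract is easy to reproduce in a short simulation. This is a minimal sketch, not the paper's own analysis; the true effect, standard error, and cutoff below are illustrative values chosen to give low power:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: true effect beta = 1 with standard error 1, so the
# signal-to-noise ratio is low and power is well below 100%.
beta, se, n_sim = 1.0, 1.0, 100_000
est = rng.normal(beta, se, n_sim)           # unbiased estimates
significant = np.abs(est / se) > 1.96       # the significance filter

# Unconditionally, the estimator is unbiased and the 95% CI covers ~95%.
covers = (est - 1.96 * se < beta) & (beta < est + 1.96 * se)
mean_all = est.mean()                       # close to the true value 1.0
cover_all = covers.mean()                   # close to 0.95

# Conditional on significance: overestimation (the winner's curse) and
# undercoverage of the confidence interval.
mean_sig = est[significant].mean()          # well above 1.0
cover_sig = covers[significant].mean()      # well below 0.95
```

In this setup the significant estimates average roughly two and a half times the true effect, and the conditional coverage of the nominal 95% interval drops to around 85%.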

Figures from this paper

Citations

A Proposal for Informative Default Priors Scaled by the Standard Error of Estimates
If we have an unbiased estimate of some parameter of interest, then its absolute value is positively biased for the absolute value of the parameter. This bias is large when the signal-to-noise ratio
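The bias described in this summary can be checked numerically. This is an illustrative sketch; the effect sizes and standard error below are arbitrary choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# b is an unbiased estimate of beta with standard error se, yet |b|
# systematically overestimates |beta|, and the bias grows as the
# signal-to-noise ratio |beta|/se shrinks.
se, n_sim = 1.0, 200_000
biases = {}
for beta in (3.0, 1.0, 0.0):                # high, moderate, zero SNR
    b = rng.normal(beta, se, n_sim)
    biases[beta] = np.abs(b).mean() - abs(beta)

# At SNR 0 the bias approaches E|N(0,1)| = sqrt(2/pi), about 0.80 of a
# standard error; at SNR 3 it is negligible.
```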
The statistical properties of RCTs and a proposal for shrinkage.
It is argued that it is important to shrink the unbiased estimator; a method for doing so is proposed, and the shrinkage estimator is shown to successfully address the exaggeration.
Why multiple hypothesis test corrections provide poor control of false positives in the real world
It is argued that a single well-defined false positive rate (FPR) does not even exist, and that the freedom scientists have in choosing the error rate to control, the collection of tests to include in the adjustment, and the method of correction provides too much flexibility for strong error control.
Sources of information waste in neuroimaging: mishandling structures, thinking dichotomously, and over-reducing data
This work examines three sources of substantial information loss and proposes a Bayesian multilevel modeling framework that closely characterizes the data hierarchies across spatial units and experimental trials to improve inference efficiency, predictive accuracy, and generalizability in neuroimaging.
Open science, the replication crisis, and environmental public health.
  • D. Hicks
  • Medicine
    Accountability in research
  • 2021
It is clear that open data initiatives can promote reproducibility and robustness but do little to promote replicability; some of the other benefits of open science are reviewed, and suggestions for funding streams to mitigate the costs of adopting open science practices in environmental public health are offered.
Dichotomous thinking and informational waste in neuroimaging
To improve inference efficiency, predictive accuracy, and generalizability, a Bayesian multilevel modeling framework is proposed and four actionable suggestions to alleviate information waste and to improve reproducibility are made.
Developmental Cognitive Neuroscience in the Era of Networks and Big Data: Strengths, Weaknesses, Opportunities, and Threats
Developmental cognitive neuroscience is being pulled in new directions by network science and big data. Brain imaging [e.g., functional magnetic resonance imaging (fMRI), functional connectivity
Ten simple rules in good research practice for early career researchers
This paper aims to provide early-career researchers with a useful introduction to good research practices.


References

A Proposal for Informative Default Priors Scaled by the Standard Error of Estimates
If we have an unbiased estimate of some parameter of interest, then its absolute value is positively biased for the absolute value of the parameter. This bias is large when the signal-to-noise ratio
A default prior for regression coefficients
It is argued that the uniform prior is not suitable as a default prior for inference about a regression coefficient in the context of the bio-medical and social sciences and proposed that a more suitable default choice is the normal distribution with mean zero and standard deviation equal to the standard error of the M-estimator.
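Under a normal likelihood, this default choice has a simple closed form: with prior N(0, se) and an estimate b whose standard error is se, the conjugate normal-normal update shrinks the estimate by exactly half. A sketch under these assumptions (the function name is illustrative, not from the paper):

```python
import math

def shrink(b: float, se: float) -> tuple[float, float]:
    """Posterior mean and sd for a coefficient with estimate b and
    standard error se, under the default prior N(0, se)."""
    prior_var, lik_var = se**2, se**2            # prior sd set equal to the SE
    post_var = 1.0 / (1.0 / prior_var + 1.0 / lik_var)
    post_mean = post_var * (b / lik_var)         # prior mean is zero
    return post_mean, math.sqrt(post_var)

# With prior sd equal to the SE, the estimate is halved and the
# posterior sd is se / sqrt(2).
mean, sd = shrink(2.0, 1.0)                      # → (1.0, 0.707...)
```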
The Importance of Predefined Rules and Prespecified Statistical Analyses: Do Not Abandon Significance.
A petition proposes retaining P values but abandoning dichotomous statements (significant/nonsignificant), suggests discussing “compatible” effect sizes, denounces “proofs of the null,” and points out that “crucial effects” are dismissed on discovery or refuted on replication because of nonsignificance.
Estimating effect size: Bias resulting from the significance criterion in editorial decisions
Experiments that find larger differences between groups than actually exist in the population are more likely to pass stringent tests of significance and be published than experiments that find
The fallacy of the null-hypothesis significance test.
To the experimental scientist, statistical inference is a research instrument, a processing device by which unwieldy masses of raw data may be refined into a product more suitable for assimilation into the corpus of science, and in this lies both strength and weakness.
Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology.
Theories in “soft” areas of psychology lack the cumulative character of scientific knowledge. They tend neither to be refuted nor corroborated, but instead merely fade away as people lose
Bayesian methods to overcome the winner’s curse in genetic studies
Parameter estimates for associated genetic variants, reported in the initial discovery samples, are often grossly inflated compared to the values observed in the follow-up replication samples. This
Publication Decisions and their Possible Effects on Inferences Drawn from Tests of Significance—or Vice Versa
There is some evidence that in fields where statistical tests of significance are commonly used, research which yields nonsignificant results is not published. Such research being unknown to
Abandon Statistical Significance
This work recommends dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences and argues that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures.
Publication decisions revisited: the effect of the outcome of statistical tests on the decision to publish
Evidence is presented that published results of scientific investigations are not a representative sample of the results of all scientific studies, and that practices leading to publication bias have not changed over a period of 30 years.