Malignant side effects of null-hypothesis significance testing

  Marc N Branch. Theory & Psychology, pp. 256-277.
  • M. Branch
  • Published 1 April 2014
  • Psychology
  • Theory & Psychology
Six decades' worth of published information has shown irrefutably that null-hypothesis significance tests (NHSTs) provide no information about the reliability of research outcomes. Nevertheless, they are still the core of editorial decision-making in psychology. Two reasons appear to contribute to the continuing practice. One, survey information suggests that a majority of psychological researchers incorrectly believe that p values provide information about reliability of results. Two, a…
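
The abstract's point about p values and reliability can be illustrated with a small simulation. The sketch below is not taken from Branch's paper; it is a minimal illustration under assumed conditions (a true standardized effect of 0.5, samples of 20, known unit variance, a two-sided z-test), and the function names are hypothetical. It reruns the same experiment many times and shows how widely the p value varies across exact replications.

```python
import math
import random
import statistics


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p value for H0: mean = 0, assuming sigma is known."""
    n = len(sample)
    z = statistics.fmean(sample) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def replication_p_values(true_effect=0.5, n=20, replications=2000, seed=1):
    """Simulate exact replications of one experiment and collect their p values.

    All parameter values are illustrative assumptions, not figures from the paper.
    """
    rng = random.Random(seed)
    return [
        two_sided_p([rng.gauss(true_effect, 1.0) for _ in range(n)])
        for _ in range(replications)
    ]


p_values = replication_p_values()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
deciles = statistics.quantiles(p_values, n=10)  # nine cut points

print(f"share of replications with p < .05: {share_significant:.2f}")
print(f"10th percentile of p: {deciles[0]:.4f}")
print(f"90th percentile of p: {deciles[-1]:.4f}")
```

Under these assumptions the effect is real and identical in every run, yet the p values spread across orders of magnitude, so a single p value says little about what a replication will yield.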

Thou Shalt Not Bear False Witness Against Null Hypothesis Significance Testing

This article addresses the NHST debate from the perspective of scientific inquiry and inference, and concludes that NHST procedures remain the only (and suitable) option.

New Guidelines for Null Hypothesis Significance Testing in Hypothetico-Deductive IS Research

A balanced account of possible actions that are implementable short-term or long-term and that incentivize or penalize specific practices is developed, to promote policy change amongst journals, scholars and students with a vested interest in hypothetico-deductive information systems research.

The “Reproducibility Crisis:” Might the Methods Used Frequently in Behavior-Analysis Research Help?

  • M. Branch
  • Psychology
    Perspectives on behavior science
  • 2019
A growing realization among researchers that p-values do not provide information that bears on repeatability may offer an opportunity for wider application of research methods frequently used in the research specialty known as Behavior Analysis, as well as a few other research traditions.

The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research

The widespread use of ‘statistical significance’ as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process, and potential arguments against removing significance thresholds are discussed.

Equivalent statistics for a one-sample t-test.

Recent insights into problems with common statistical practice in psychology have motivated scientists to consider alternatives to the traditional frequentist approach that compares p-values to a

Replicability Crisis in Social Psychology: Looking at the Past to Find New Pathways for the Future

Over the last few years, psychology researchers have become increasingly preoccupied with the question of whether findings from psychological studies are generally replicable. The debates have

Significance Testing Needs a Taxonomy

Neyman and Pearson’s approach in the application of statistical analyses using alpha and beta error rates has played a dominant role guiding inferential judgments, appropriately in highly determined situations and inappropriately in scientific exploration.

What’s in a p? Reassessing best practices for conducting and reporting hypothesis-testing research

Social science research has recently been subject to considerable criticism regarding the validity and power of empirical tests published in leading journals, and business scholarship is no

Improving Psychological Science through Transparency and Openness: An Overview

An overview of recent discussions concerning replicability and best practices in mainstream psychology, with an emphasis on the practical benefits to both researchers and the field as a whole, is provided.

Predict, Control, and Replicate to Understand: How Statistics Can Foster the Fundamental Goals of Science

  • P. Killeen
  • Psychology
    Perspectives on behavior science
  • 2019
Several alternatives to null hypothesis testing are sketched: Bayesian, model comparison, and predictive inference (p_rep).



Null hypothesis significance testing: a review of an old and continuing controversy.

The concluding opinion is that NHST is easily misunderstood and misused but that when applied with good judgment it can be an effective aid to the interpretation of experimental data.

On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test

Criticisms of null-hypothesis significance tests (NHSTs) are reviewed. Used as formal, two-valued decision procedures, they often generate misleading conclusions. However, critics who argue that NHSTs

Consequences of Prejudice Against the Null Hypothesis

The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to simulate the research-publication process and (b) case studies of

The test of significance in psychological research.

  • D. Bakan
  • Psychology
    Psychological bulletin
  • 1966
The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The

Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology.

Theories in “soft” areas of psychology lack the cumulative character of scientific knowledge. They tend neither to be refuted nor corroborated, but instead merely fade away as people lose

Significance Tests Die Hard

We present a critique showing the flawed logical structure of statistical significance tests. We then attempt to analyze why, in spite of this faulty reasoning, the use of significance tests

Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy

The historical and logical foundations of the dominant school of medical statistics, sometimes referred to as frequentist statistics, are explored and the logical fallacy at the heart of this system is explicated, which maintains such a tenacious hold on the minds of investigators, policymakers, and journal editors.

The earth is round (p < .05)

After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the

Statistical Significance and Replicability

In spite of arguments to the contrary, psychologists, it is shown here, believe statistical significance (SS) signifies that a finding will replicate. The most visible argument that SS is not an

Research news and Comment: AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms

The present comment reviews practices revolving around statistical significance testing in an accessible manner; many people who use statistical tests might not place such a premium on the tests if they understood what the tests really do, and what they do not do.