Null hypothesis significance testing: a review of an old and continuing controversy.

@article{Nickerson2000NullHS,
  title={Null hypothesis significance testing: a review of an old and continuing controversy.},
  author={Raymond S. Nickerson},
  journal={Psychological methods},
  year={2000},
  volume={5},
  number={2},
  pages={241--301}
}
  • R. Nickerson
  • Published 1 June 2000
  • Psychology, Medicine
  • Psychological methods
Null hypothesis significance testing (NHST) is arguably the most widely used approach to hypothesis evaluation among behavioral and social scientists. It is also very controversial. A major concern expressed by critics is that such testing is misunderstood by many of those who use it. Several other objections to its use have also been raised. In this article the author reviews and comments on the claimed misunderstandings as well as on other criticisms of the approach, and he notes arguments… 

The Controversy over Null Hypothesis Significance Testing Revisited
Abstract. Null hypothesis significance testing (NHST) is one of the most widely used methods for testing hypotheses in psychological research. However, it has remained shrouded in controversy
Malignant side effects of null-hypothesis significance testing
Six decades-worth of published information has shown irrefutably that null-hypothesis significance tests (NHSTs) provide no information about the reliability of research outcomes. Nevertheless, they
A Critical Assessment of Null Hypothesis Significance Testing in Quantitative Communication Research
Null hypothesis significance testing (NHST) is the most widely accepted and frequently used approach to statistical inference in quantitative communication research. NHST, however, is highly
When Null Hypothesis Significance Testing Is Unsuitable for Research: A Reassessment
TLDR
It is suggested that, after sustained negative experience, NHST should no longer be the default, dominant statistical practice of all biomedical and psychological research.
Calculating the main alternatives to null-hypothesis-significance testing in between-subject experimental designs.
TLDR
An attempt is made to provide the applied researcher with resources that make it possible to analyse and interpret the results of any research study using a group of indicators that lends a high level of validity to the statistical inference performed.
When null hypothesis significance testing is unsuitable for research: a reassessment
TLDR
It is suggested that NHST should no longer be the default, dominant statistical practice of all biomedical and psychological research, and either more in-depth statistical training of more researchers and/or more widespread involvement of professional statisticians in all research is encouraged.
Thou Shalt Not Bear False Witness Against Null Hypothesis Significance Testing
TLDR
This article addresses the NHST debate from the perspective of scientific inquiry and inference, and concludes that NHST procedures remain the only (and suitable) option.
Bayesian Hypothesis Testing: An Alternative to Null Hypothesis Significance Testing (NHST) in Psychology and Social Sciences
Since the mid-1950s, there has been a clear predominance of the Frequentist approach to hypothesis testing, both in psychology and in social sciences. Despite its popularity in the field of
Significance, truth and proof of p values: reminders about common misconceptions regarding null hypothesis significance testing
TLDR
This article seeks to extend Fayer's paper on statistically significant correlations and to clarify some of the controversies regarding statistical significance testing by explaining that (1) the p value is not the probability of the null hypothesis; (2) rejecting the null hypothesis does not prove that the alternative hypothesis is true; and (3) not rejecting the null hypothesis does not prove that the alternative hypothesis is false.

References

SHOWING 1-10 OF 455 REFERENCES
On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test
Criticisms of null-hypothesis significance tests (NHSTs) are reviewed. Used as formal, two-valued decision procedures, they often generate misleading conclusions. However, critics who argue that NHSTs
The appropriate use of null hypothesis testing.
The many criticisms of null hypothesis testing suggest when it is not useful and what it should not be used for. This article explores when and why its use is appropriate. Null hypothesis testing is
In praise of the null hypothesis statistical test.
Jacob Cohen (1994) raised a number of questions about the logic and information value of the null hypothesis statistical test (NHST). Specifically, he suggested that: (a) The NHST does not tell us
On the Logic and Purpose of Significance Testing
There has been much recent attention given to the problems involved with the traditional approach to null hypothesis significance testing (NHST). Many have suggested that, perhaps, NHST should be
Should Significance Tests be Banned? Introduction to a Special Section Exploring the Pros and Cons
Significance testing of null hypotheses is the standard epistemological method for advancing scientific knowledge in psychology, even though it has drawbacks and it leads to common inferential
Rejoinder: Editorial Policies Regarding Statistical Significance Tests: Further Comments
In this response to Robinson and Levin’s comments on Thompson (1996), it is argued that describing results as “significant” rather than “statistically significant” is confusing to those persons most
Effect sizes and p values: what should be reported and what should be replicated?
TLDR
The most-criticized flaws of NHT can be avoided when the importance of a hypothesis is used to determine that a finding is worthy of report, and when p approximately equal to .05 is treated as insufficient basis for confidence in the replicability of an isolated non-null finding.
Statistical Significance and Replicability
Commentators agree with me that statistical significance does not betoken replicability, but not for the reasons I give. Chow (1998) defends the use of significance tests by arguing that they are
The concept of statistical significance and the controversy about one-tailed tests.
  • H. Eysenck
  • Psychology, Medicine
  • Psychological review
  • 1960
TLDR
It is suggested here that most of the disagreements emerging from this controversy stem from a misunderstanding of the term "significance," and it is further suggested that the same misunderstanding runs through many discussions of two-tailed tests as well.
Significance Testing in Psychological Research: Some Persisting Issues
Empirical surveys show that reports of significance tests appear in the vast majority of articles in psychological research journals and are relied on by both investigators and journal reviewers when