Significance Tests Harm Progress in Forecasting

@article{Armstrong2007SignificanceTH,
  title={Significance Tests Harm Progress in Forecasting},
  author={J. Scott Armstrong},
  journal={Forecasting Models eJournal},
  year={2007}
}
  • J. Scott Armstrong
  • Published 2007
  • Psychology
  • Forecasting Models eJournal
Based on a summary of prior literature, I conclude that tests of statistical significance harm scientific progress. Efforts to find exceptions to this conclusion have, to date, turned up none. Even when done correctly, significance tests are dangerous. I show that summaries of scientific research do not require tests of statistical significance. I illustrate the dangers of significance tests by examining an application to the M3-Competition. Although the authors of that reanalysis conducted a…
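To make the abstract's point concrete, here is a minimal sketch (mine, not the paper's) of the kind of summary the article argues is sufficient: comparing forecasting methods by a descriptive effect size, such as the relative reduction in median absolute percentage error, with no significance test anywhere. The method names and error distributions below are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical absolute percentage errors (APE) for two methods on 100 series.
ape_damped = rng.gamma(shape=2.0, scale=6.0, size=100)  # e.g., a damped-trend method
ape_naive = rng.gamma(shape=2.0, scale=7.5, size=100)   # e.g., a naive benchmark

# Descriptive summary: median APE per method and the relative error reduction.
mdape_damped = np.median(ape_damped)
mdape_naive = np.median(ape_naive)
reduction = (mdape_naive - mdape_damped) / mdape_naive

print(f"MdAPE damped: {mdape_damped:.1f}%   MdAPE naive: {mdape_naive:.1f}%")
print(f"Relative reduction in MdAPE: {reduction:.1%}")

As the reference "The M3 competition: Statistical tests of the results" notes below, the M3-Competition's main conclusions were drawn from exactly this style of descriptive comparison, with no formal statistical testing.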
Caveats for using statistical significance tests in research assessments
This article raises concerns about the advantages of using statistical significance tests in research assessments as has recently been suggested in the debate about proper normalization procedures…
Pitfalls of significance testing and p-value variability: An econometrics perspective
Data on how many scientific findings are reproducible are generally bleak and a wealth of papers have warned against misuses of the p-value and resulting false findings in recent years. This paper…
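To illustrate the p-value variability this entry warns about, the following sketch repeats the same two-sample experiment under a fixed, modest true effect; the effect size and sample size are illustrative assumptions, not figures from the paper.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect, n = 0.4, 30  # assumed modest effect and sample size

pvals = []
for _ in range(20):
    a = rng.normal(0.0, 1.0, n)          # control group
    b = rng.normal(true_effect, 1.0, n)  # treatment group, same effect every time
    pvals.append(stats.ttest_ind(a, b).pvalue)

# The data-generating process never changes, yet the p-values swing widely.
print([round(p, 3) for p in sorted(pvals)])

Identical data-generating processes routinely yield p-values ranging from well below 0.01 to well above 0.05, which is one reason a single p-value is a fragile summary of evidence.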
PERSPECTIVE - Researchers Should Make Thoughtful Assessments Instead of Null-Hypothesis Significance Tests
Instead of making NHSTs, researchers should adapt their research assessments to specific contexts and specific research goals, and then explain their rationales for selecting assessment indicators.
The Illusion of Predictability: How Regression Statistics Mislead Experts
The implications of the study suggest the need to reconsider the way in which empirical results are presented, and the possible provision of easy-to-use simulation tools that would enable readers of empirical papers to make accurate inferences.
Response to Commentaries on “The illusion of predictability: How regression statistics mislead experts”
Does the manner in which results are presented in empirical studies affect perceptions of the predictability of the outcomes? Noting the predominant role of linear regression analysis in empirical…
Null-hypothesis significance tests in behavioral and management research: We can do better
Null-hypothesis significance tests (NHST) are a very troublesome methodology that dominates the quantitative empirical research in strategy and management. Inherent limitations and inappropriate…
Testing University Rankings Statistically: Why this Perhaps is not such a Good Idea after All. Some Reflections on Statistical Power, Effect Size, Random Sampling and Imaginary Populations
In this paper we discuss and question the use of statistical significance tests in relation to university rankings as recently suggested. We outline the assumptions behind and interpretations of…
Inferential misconceptions and replication crisis
The most serious flaws related to the p-value are systematized, and suggestions for preventing them and reducing the rate of false discoveries in the future are discussed.
How Can Significance Tests Be Deinstitutionalized?
The purpose of this article is to propose possible solutions to the methodological problem of null hypothesis significance testing (NHST), which is framed as deeply embedded in the institutional…
The Ombudsman: Verification of Citations: Fawlty Towers of Knowledge?
Citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, are examined to illustrate faulty citations and recommend that journals include a section on their websites to list all relevant papers that have been overlooked and how the omitted paper relates to the published paper.

References

Showing 1-10 of 30 references
Needed: A Ban on the Significance Test
The significance test as currently used is a disaster. Whereas most researchers falsely believe that the significance test has an error rate of 5%, empirical studies show the average error rate across…
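A back-of-envelope calculation shows why the "5% error rate" belief is mistaken; the base rate of true hypotheses and the statistical power below are my own illustrative assumptions, not figures from the paper.

# Assumed inputs: nominal alpha, statistical power, and the share of
# tested hypotheses that are actually true. All three are illustrative.
alpha, power, base_rate = 0.05, 0.50, 0.10

true_positives = base_rate * power          # true effects correctly detected
false_positives = (1 - base_rate) * alpha   # null effects declared significant
false_discovery_rate = false_positives / (true_positives + false_positives)

print(f"Share of 'significant' findings that are false: {false_discovery_rate:.0%}")

Under these assumptions, nearly half of the "significant" findings are false, far from the nominal 5% that the alpha level seems to promise.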
Should Significance Tests be Banned? Introduction to a Special Section Exploring the Pros and Cons
Significance testing of null hypotheses is the standard epistemological method for advancing scientific knowledge in psychology, even though it has drawbacks and it leads to common inferential…
Verification of Citations: Fawlty Towers of Knowledge?
This paper examines citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, as an exploratory study to illustrate the prevalence of faulty citations and provides specific operational recommendations on adjusting for nonresponse bias.
The Ombudsman: Verification of Citations: Fawlty Towers of Knowledge?
Citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, are examined to illustrate faulty citations and recommend that journals include a section on their websites to list all relevant papers that have been overlooked and how the omitted paper relates to the published paper.
Statistical Significance with Comments by Editors of Marketing Journals
The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals. The results…
Debiasing forecasts: how useful is the unbiasedness test?
A number of studies have demonstrated the improvements in accuracy that can result from correcting judgmental forecasts to remove systematic bias. It has been suggested that the…
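The debiasing idea referenced above can be sketched as a regression of actual outcomes on judgmental forecasts, with the fitted line used to correct subsequent forecasts (a Theil-style linear correction). The data below are simulated and the setup is my own illustration; the paper itself questions how useful the associated unbiasedness test is for deciding when to apply such a correction.

import numpy as np

rng = np.random.default_rng(2)
actuals = rng.normal(100.0, 10.0, 60)
# Judgmental forecasts with a systematic optimism bias plus noise.
forecasts = 0.8 * actuals + 25.0 + rng.normal(0.0, 4.0, 60)

# Fit actual = a + b * forecast by least squares.
b, a = np.polyfit(forecasts, actuals, 1)  # polyfit returns slope, then intercept
debiased = a + b * forecasts

print(f"intercept = {a:.1f}, slope = {b:.2f}")
print(f"MAE before correction: {np.mean(np.abs(forecasts - actuals)):.2f}")
print(f"MAE after correction:  {np.mean(np.abs(debiased - actuals)):.2f}")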
The Historical Growth of Statistical Significance Testing in Psychology--and Its Future Prospects.
The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals. The results…
Confusion Over Measures of Evidence (p's) Versus Errors (α's) in Classical Statistical Testing
Confusion surrounding the reporting and interpretation of results of classical statistical tests is widespread among applied researchers, most of whom erroneously believe that such tests are…
Findings from Evidence-Based Forecasting: Methods for Reducing Forecast Error
Empirical comparisons of reasonable approaches provide evidence on the best forecasting procedures to use under given conditions. Based on this evidence, I summarize the progress made over the past…
The M3 competition: Statistical tests of the results
The main conclusions of the M3 competition were derived from the analyses of descriptive statistics with no formal statistical testing. One of the commentaries noted that the results had not been…