When Does HARKing Hurt? Identifying When Different Types of Undisclosed Post Hoc Hypothesizing Harm Scientific Progress

@article{Rubin2017,
  title={When Does HARKing Hurt? Identifying When Different Types of Undisclosed Post Hoc Hypothesizing Harm Scientific Progress},
  author={Mark Rubin},
  journal={Review of General Psychology},
  year={2017},
  pages={308--320}
}
  • Mark Rubin
  • Published 1 December 2017
  • Psychology
  • Review of General Psychology
Hypothesizing after the results are known, or HARKing, occurs when researchers check their research results and then add or remove hypotheses on the basis of those results without acknowledging this process in their research report (Kerr, 1998). In the present article, I discuss 3 forms of HARKing: (a) using current results to construct post hoc hypotheses that are then reported as if they were a priori hypotheses; (b) retrieving hypotheses from a post hoc literature search and reporting them… 


The Costs of HARKing

  • Mark Rubin
  • Psychology
    The British Journal for the Philosophy of Science
  • 2022
Kerr (1998) coined the term ‘HARKing’ to refer to the practice of ‘hypothesizing after the results are known’. This questionable research practice has received increased attention in recent years


The practice of HARKing—hypothesizing after results are known—is commonly maligned as undermining the reliability of scientific findings. There are several accounts in the literature as to why

Does preregistration improve the credibility of research findings?

  • Mark Rubin
  • Business
    The Quantitative Methods for Psychology
  • 2020
Preregistration entails researchers registering their planned research hypotheses, methods, and analyses in a time-stamped document before they undertake their data collection and analyses. This

Predict, Control, and Replicate to Understand: How Statistics Can Foster the Fundamental Goals of Science

  • P. Killeen
  • Psychology
    Perspectives on behavior science
  • 2019
Several alternatives to null hypothesis testing are sketched: Bayesian, model comparison, and predictive inference (p_rep).

New developments in research methods

  • A. Ledgerwood
Recent events have placed psychological science at the forefront of a broad movement across scientific disciplines to improve research methods and practices.

Raiders of the lost HARK: a reproducible inference framework for big data science

A HARK-solid, reproducible inference framework suitable for big data, based on models that represent formalization of hypotheses is proposed, which underpins ‘natural selection’ in a knowledge base maintained by the scientific community.

Stage and Sub-stage Models

  • G. Young
  • Psychology
    Causality and Development
  • 2019
This first chapter on the second portion of the book – on a 25-step Neo-Eriksonian stage/sub-stage model – first describes the empirical (replication) crisis in psychology and argues that grand

Navigating the review process through the holier than thou

As the focal article suggests, reviewing is a complex job that requires sophisticated knowledge and skills. One guiding recommendation included within the integrity competency presented in the focal

What type of Type I error? Contrasting the Neyman–Pearson and Fisherian approaches in the context of exact and direct replications

It is concluded that the replication crisis may be partly (not wholly) due to researchers’ unrealistic expectations about replicability based on their consideration of the Neyman–Pearson Type I error rate across a long run of exact replications.

HARKing: Hypothesizing After the Results are Known

  • L.
  • Psychology
  • 2002
This article considers a practice in scientific communication termed HARKing (Hypothesizing After the Results are Known). HARKing is defined as presenting a post hoc hypothesis (i.e., one based on or

HARKing's Threat to Organizational Research: Evidence From Primary and Meta‐Analytic Sources

We assessed presumed consequences of hypothesizing after results are known (HARKing) by contrasting hypothesized versus nonhypothesized effect sizes among 10 common relations in organizational

Forgetting What We Learned as Graduate Students: HARKing and Selective Outcome Reporting in I–O Journal Articles

With an overarching concern on how trustworthy and accurate the accumulated scientific knowledge is in industrial–organizational (I–O) psychology research, Kepes and McDaniel (2013) discuss how

An Agenda for Purely Confirmatory Research

This article proposes that researchers preregister their studies and indicate in advance the analyses they intend to conduct, and proposes that only these analyses deserve the label “confirmatory,” and only for these analyses are the common statistical tests valid.

Why psychologists must change the way they analyze their data: the case of psi: comment on Bem (2011).

It is concluded that Bem's p values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data.

Replicability Crisis in Social Psychology: Looking at the Past to Find New Pathways for the Future

Over the last few years, psychology researchers have become increasingly preoccupied with the question of whether findings from psychological studies are generally replicable. The debates have

Presenting Post Hoc Hypotheses as A Priori: Ethical and Theoretical Issues

  • K. Leung
  • Business, Psychology
    Management and Organization Review
  • 2011
Presenting post hoc hypotheses based on empirical findings as if they had been developed a priori seems common in management papers. The pure form of this practice is likely to breach research ethics

Do p Values Lose Their Meaning in Exploratory Analyses? It Depends How You Define the Familywise Error Rate

Several researchers have recently argued that p values lose their meaning in exploratory analyses due to an unknown inflation of the alpha level (e.g., Nosek & Lakens, 2014; Wagenmakers, 2016). For

Is psychology suffering from a replication crisis? What does "failure to replicate" really mean?

This article suggests that so-called failures to replicate may not be failures at all, but rather are the result of low statistical power in single replication studies, and of failure to appreciate the need for multiple replications in order to have enough power to identify true effects.

A Tutorial on Hunting Statistical Significance by Chasing N

This work systematically illustrates the large impact of some easy to implement and so, perhaps frequent data dredging techniques on boosting false positive findings and illustrates that it is extremely easy to introduce strong bias into data by very mild selection and re-testing.