Validity and Mechanical Turk: An assessment of exclusion methods and interactive experiments

Kyle A. Thomas and Scott Clifford
Comput. Hum. Behav.
Social science researchers increasingly recruit participants through Amazon's Mechanical Turk (MTurk) platform. Yet the physical isolation of MTurk participants and the perceived lack of experimental control have led to persistent concerns about the quality of the data that can be obtained from MTurk samples. In this paper we focus on two of the most salient concerns: that MTurk participants may not buy into interactive experiments and that they may produce unreliable or invalid data. We review…
Online panels in social science research: Expanding sampling methods beyond Mechanical Turk
It is concluded that online research panels offer a unique opportunity for research, yet one with some important trade-offs, as compared with traditional student subject pools.
Amazon Mechanical Turk workers can provide consistent and economically meaningful data
We explore the consistency of the characteristics of individuals who participate in studies posted on Amazon Mechanical Turk (AMT). The primary individuals analyzed in this study are subjects who…
The Experimenters' Dilemma: Inferential Preferences over Populations
We compare three populations commonly used in experiments by economists and other social scientists: undergraduate students at a physical location (lab), Amazon’s Mechanical Turk (MTurk), and…
Exclusion Criteria in Experimental Philosophy
When experimental philosophers carry out studies on thought experiments, some participants are excluded based on certain exclusion criteria, mirroring standard social science vignette methodology.
MTurk, Prolific or Panels? Choosing the Right Audience for Online Research
Researchers in various fields of behavioral research are increasingly using online audiences to conduct studies and surveys, but there is still considerable uncertainty about the quality of the…
Distinction without a difference? An assessment of MTurk Worker types
Amazon’s Mechanical Turk (MTurk) platform is a popular tool for scholars seeking a reasonably representative population to recruit subjects for academic research that is cheaper than contract work…
Improving Data Quality Using Amazon Mechanical Turk Through Platform Setup
Focusing on task setup, selected platform-level strategies that have received relatively less attention in previous research are empirically tested to further enhance the contribution of the proposed best practices for MTurk usage.
Analyze the Attentive & Bypass Bias: Mock Vignette Checks in Survey Experiments
Respondent inattentiveness threatens to undermine experimental studies. In response, researchers incorporate measures of attentiveness into their analyses, yet often in a way that risks introducing…
Noncompliant responding: Comparing exclusion criteria in MTurk personality research to improve data quality
Studies on Amazon Mechanical Turk (MTurk) often include check questions in personality inventories to ensure data quality. However, a subset of MTurk workers may give only meaningful…
Are Manipulation Checks Necessary?
The use of manipulation checks in mediational analyses does not rule out confounding variables, as any unmeasured variables that correlate with the manipulation check may still drive the relationship.


Data Collection in a Flat World: The Strengths and Weaknesses of Mechanical Turk Samples
MTurk offers a highly valuable opportunity for data collection, and it is recommended that researchers using MTurk include screening questions that gauge attention and language comprehension, avoid questions with factual answers, and consider how individual differences in financial and social domains may influence results.
Evaluating Amazon's Mechanical Turk as a Tool for Experimental Behavioral Research
This paper replicates a diverse body of tasks from experimental psychology, including the Stroop, Switching, Flanker, Simon, Posner Cuing, attentional blink, subliminal priming, and category learning tasks, using participants recruited through AMT.
Are samples drawn from Mechanical Turk valid for research on political ideology?
Amazon’s Mechanical Turk (MTurk) is an increasingly popular tool for the recruitment of research subjects. While there has been much focus on the demographic differences between MTurk samples and the…
Amazon's Mechanical Turk
Findings indicate that MTurk can be used to obtain high-quality data inexpensively and rapidly, and that the data obtained are at least as reliable as those obtained via traditional methods.
Attentive Turkers: MTurk participants perform better on online attention checks than do subject pool participants
In three online studies, participants from MTurk and collegiate populations completed a task that included a measure of attentiveness to instructions (an instructional manipulation check, or IMC). MTurkers were more attentive to the instructions than were college students, even on novel IMCs.
Crowdsourcing performance evaluations of user interfaces
MTurk may be a productive setting for conducting performance evaluations of user interfaces, providing a complementary approach to existing methodologies; three previously well-studied user interface designs are evaluated.
A reliability analysis of Mechanical Turk data
  • S. Rouse
  • Psychology, Computer Science
  • Comput. Hum. Behav.
  • 2015
MTurk-based responses for a personality scale were found to be significantly less reliable than scores previously reported for a community sample, and the presence of an item asking respondents to affirm that they were attentive and honest was associated with more reliable responses.
Evaluating Online Labor Markets for Experimental Research: Amazon.com's Mechanical Turk
It is shown that respondents recruited in this manner are often more representative of the U.S. population than in-person convenience samples but less representative than subjects in Internet-based panels or national probability samples.
Generalizing from Survey Experiments Conducted on Mechanical Turk: A Replication Approach
  • A. Coppock
  • Psychology
  • Political Science Research and Methods
  • 2018
To what extent do survey experimental treatment effect estimates generalize to other populations and contexts? Survey experiments conducted on convenience samples have often been criticized on the…
The pitfall of experimenting on the web: How unattended selective attrition leads to surprising (yet false) research conclusions.
The authors find that experimental studies using online samples (e.g., MTurk) often violate the assumption of random assignment, because participant attrition, that is, quitting a study before completing it…