Evaluating Online Labor Markets for Experimental Research: Amazon.com's Mechanical Turk

by Adam J. Berinsky, Gregory A. Huber, and Gabriel S. Lenz. Political Analysis, pp. 351–368.
We examine the trade-offs associated with using Amazon.com's Mechanical Turk (MTurk) interface for subject recruitment. We first describe MTurk and its promise as a vehicle for performing low-cost and easy-to-field experiments. We then assess the internal and external validity of experiments performed using MTurk, employing a framework that can be used to evaluate other subject pools. We first investigate the characteristics of samples drawn from the MTurk population. We show that respondents…
Running Behavioral Operations Experiments Using Amazon's Mechanical Turk
Overall, MTurk appears to be an important and relevant tool for researchers in behavioral operations, but researchers are cautioned about slower learning among MTurk subjects and the use of social preference manipulations on MTurk.
A Technical Guide to Using Amazon's Mechanical Turk in Behavioral Accounting Research
ABSTRACT: Multiple social science researchers claim that online data collection, mainly via Amazon's Mechanical Turk (MTurk), has revolutionized the behavioral sciences (Gureckis et al. 2016; Litman, …
Are samples drawn from Mechanical Turk valid for research on political ideology?
Amazon’s Mechanical Turk (MTurk) is an increasingly popular tool for the recruitment of research subjects. While there has been much focus on the demographic differences between MTurk samples and the …
Pay Rates and Subject Performance in Social Science Experiments Using Crowdsourced Online Samples
Abstract Mechanical Turk has become an important source of subjects for social science experiments, providing a low-cost alternative to the convenience of using undergraduates while avoiding the …
Inside the Turk
Mechanical Turk (MTurk), an online labor market created by Amazon, has recently become popular among social scientists as a source of survey and experimental data. The workers who populate this …
Amazon Mechanical Turk workers can provide consistent and economically meaningful data
We explore the consistency of the characteristics of individuals who participate in studies posted on Amazon Mechanical Turk (AMT). The primary individuals analyzed in this study are subjects who …
Mechanical Turk and the “Don’t Know” Option
ABSTRACT Luskin and Bullock’s (2011) randomized experiment on live-interview respondents found no evidence that American National Election Studies and Time-Sharing Experiments for the Social Sciences …
A Preliminary Study of Daily Sample Composition on Amazon Mechanical Turk
Amazon Mechanical Turk (AMT) has become a powerful tool for social scientists due to its inexpensiveness, ease of use, and ability to attract large numbers of workers. While the subject pool is …
A Study of Daily Sample Composition on Amazon Mechanical Turk
This work addresses the question of whether HIT posting time/day can have an impact on the population that is sampled, and shows that, except for gender, there is no statistically significant difference in demographic characteristics as a function of HIT posting time.
Validity and Mechanical Turk: An assessment of exclusion methods and interactive experiments
It is found that insufficient attention is no more a problem among MTurk samples than among other commonly used convenience or high-quality commercial samples, and that MTurk participants buy into interactive experiments and trust researchers as much as participants in laboratory studies.
The online laboratory: conducting experiments in a real labor market
This work presents views on the potential role that online experiments can play within the social sciences, and recommends software development priorities and best practices.
Running Experiments on Amazon Mechanical Turk
Although Mechanical Turk has recently become popular among social scientists as a source of experimental data, doubts may linger about the quality of data provided by subjects recruited …
Crowdsourcing user studies with Mechanical Turk
Although micro-task markets have great potential for rapidly collecting user measurements at low costs, it is found that special care is needed in formulating tasks in order to harness the capabilities of the approach.
Beyond the “Narrow Data Base”: Another Convenience Sample for Experimental Research
The experimental approach has begun to permeate political science research, increasingly so in the last decade. Laboratory researchers face at least two challenges: determining whom to study and how …
Who are the crowdworkers?: shifting demographics in mechanical turk
This paper describes how the worker population has changed over time, shifting from a primarily moderate-income, U.S.-based workforce toward an increasingly international group with a significant population of young, well-educated Indian workers.
Are your participants gaming the system?: screening mechanical turk workers
A screening process used in conjunction with a survey administered via Amazon.com's Mechanical Turk identified 764 of 1,962 respondents who did not answer conscientiously; these respondents were also the most likely to fail the qualification task.
Cambridge Handbook of Experimental Political Science: Students as Experimental Participants
An experiment entails randomly assigning participants to various conditions or manipulations. Given common consent requirements, this means experimenters need to recruit participants who, in essence, …
The labor economics of paid crowdsourcing
A model of workers supplying labor to paid crowdsourcing projects is presented, and a novel method for estimating a worker's reservation wage, the key parameter in the labor supply model, is introduced.
Financial incentives and the "performance of crowds"
It is found that increased financial incentives increase the quantity, but not the quality, of work performed by participants, where the difference appears to be due to an "anchoring" effect.
Breaking Monotony with Meaning: Motivation in Crowdsourcing Markets
It is found that when a task was framed more meaningfully, workers were more likely to participate; the meaningful treatment increased the quantity of output, while the shredded treatment decreased the quality of output.