Corpus ID: 15852710

Not All HITs Are Created Equal: Controlling for Reasoning and Learning Processes in MTurk

Jessica R. Hullman
Challenges of crowdsourcing human-computer interaction (HCI) experiments on Amazon’s Mechanical Turk include risks posed by the combination of low monetary rewards and worker anonymity: mirroring task structure across HIT or qualification questions may encourage the learning of shallow heuristics, workers’ intrinsic motivation is difficult to increase, and verification questions can interrupt natural reasoning processes, leading to a mismatch between experimental and… 
Can the Internet grade math? Crowdsourcing a complex scoring task and picking the optimal crowd size
Crowdsourcing is presented as a novel approach to reducing the grading burden of constructed response assessments, and a novel subsampling procedure is developed, which allows a large data-collection experiment to be split into many smaller pseudo-experiments in such a way as to respect within-worker and between-worker variance.
Narratives in Crowdsourced Evaluation of Visualizations: A Double-Edged Sword?
Narratives are found to have complex and unanticipated effects, calling for more studies in this area, and evidence is found that adding data semantics increases accuracy.
Crowdwork, crisis and convergence: how the connected crowd organizes information during mass disruption events
The paper concludes with a holistic view of crowdwork on social media platforms as collective intelligence manifested within a global cognitive system, explaining how information is processed through a variety of activities at different layers of a complex information space that includes crowdworkers, virtual organizations, and the social media sites that host both the information and the information processing.
Information Visualization for Decision Making: Identifying Biases and Moving Beyond the Visual Analysis Paradigm. (La visualisation d'information pour la prise de décision: identifier les biais et aller au-delà du paradigme de l'analyse visuelle)
Research lacks the metrics, methods, and empirical work needed to validate the effectiveness of visualizations for decision making, and a decision can be “correct” yet nonetheless irrational.


Crowdsourcing user studies with Mechanical Turk
Although micro-task markets have great potential for rapidly collecting user measurements at low costs, it is found that special care is needed in formulating tasks in order to harness the capabilities of the approach.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular
Running Experiments on Amazon Mechanical Turk
Although Mechanical Turk has recently become popular among social scientists as a source of experimental data, doubts may linger about the quality of data provided by subjects recruited
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk
It is found that, when combined, non-expert judgments have a high level of agreement with the existing gold-standard judgments of machine translation quality and correlate more strongly with expert judgments than BLEU does; Mechanical Turk can be used to calculate human-mediated translation edit rate (HTER), to conduct reading comprehension experiments with machine translation, and to create high-quality reference translations.
Financial incentives and the "performance of crowds"
It is found that increased financial incentives increase the quantity, but not the quality, of work performed by participants, where the difference appears to be due to an "anchoring" effect.
Overcoming intuition: metacognitive difficulty activates analytic reasoning.
Four experiments suggest that System 2 processes are activated by metacognitive experiences of difficulty or disfluency during reasoning; this activation reduced the impact of heuristics and defaults in judgment, reduced reliance on peripheral cues in persuasion, and improved syllogistic reasoning.
Crowdsourcing graphical perception: using mechanical turk to assess visualization design
The viability of Amazon's Mechanical Turk as a platform for graphical perception experiments is assessed, cost and performance data are reported, and recommendations for the design of crowdsourced studies are distilled.
The online laboratory: conducting experiments in a real labor market
The views on the potential role that online experiments can play within the social sciences are presented, and software development priorities and best practices are recommended.
Eliciting Self-Explanations Improves Understanding
It is shown that self-explanation can also be facilitative when it is explicitly promoted, in the context of learning declarative knowledge from an expository text, and three processing characteristics of self-explaining are considered as reasons for the gains in deeper understanding.