Corpus ID: 15852710

Not All HITs Are Created Equal: Controlling for Reasoning and Learning Processes in MTurk

@inproceedings{Hullman2011NotAH,
  title={Not All HITs Are Created Equal: Controlling for Reasoning and Learning Processes in MTurk},
  author={Jessica R. Hullman},
  year={2011}
}
Challenges of crowdsourcing human-computer interaction (HCI) experiments on Amazon’s Mechanical Turk include risks posed by the combination of low monetary rewards and worker anonymity. These include how mirroring task structure across HIT or qualification questions may encourage the learning of shallow heuristics, the difficulty of increasing workers’ intrinsic motivation, and how verification questions can interrupt natural reasoning processes, leading to a mismatch between experimental and…

Citations

Can the Internet grade math? Crowdsourcing a complex scoring task and picking the optimal crowd size
Crowdsourcing is presented as a novel approach to reducing the grading burden of constructed response assessments, and a novel subsampling procedure is developed that allows a large data-collection experiment to be split into many smaller pseudo-experiments in such a way as to respect within-worker and between-worker variance.
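The crowd-size question in the entry above lends itself to a small simulation. The following Python sketch is not the paper's procedure, only a minimal illustration under assumed data: it averages ratings from randomly drawn crowds of increasing size and reports the error against a hypothetical expert score, which is one crude way to see where adding more workers stops paying off. All function names and the data layout are invented for the example.

import random
import statistics

def crowd_vs_expert_error(ratings_by_item, expert_by_item, crowd_size, n_draws=200):
    # Mean absolute error of a crowd-average score against an expert score.
    # ratings_by_item: {item: [worker ratings]}   (hypothetical layout)
    # expert_by_item:  {item: expert score}
    errors = []
    for _ in range(n_draws):
        for item, ratings in ratings_by_item.items():
            if len(ratings) < crowd_size:
                continue  # not enough ratings to form a crowd of this size
            sample = random.sample(ratings, crowd_size)
            errors.append(abs(statistics.mean(sample) - expert_by_item[item]))
    return statistics.mean(errors)

if __name__ == "__main__":
    # Toy data: three items, each rated by 12 workers on a 0-4 scale (made up).
    ratings = {
        "q1": [3, 4, 3, 2, 4, 3, 3, 4, 2, 3, 4, 3],
        "q2": [1, 0, 2, 1, 1, 2, 0, 1, 1, 2, 1, 1],
        "q3": [4, 4, 3, 4, 2, 4, 3, 4, 4, 3, 4, 4],
    }
    expert = {"q1": 3, "q2": 1, "q3": 4}
    for k in (1, 3, 5, 10):
        print("crowd size", k, "-> mean abs. error",
              round(crowd_vs_expert_error(ratings, expert, k), 3))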
Narratives in Crowdsourced Evaluation of Visualizations: A Double-Edged Sword?
Narratives are found to have complex and unanticipated effects, calling for more studies in this area, and evidence is found that adding data semantics increases accuracy.
Crowdwork, crisis and convergence: how the connected crowd organizes information during mass disruption events
Social media have experienced widespread adoption in recent years. Though these platforms are designed and appropriated for a range of purposes, users consistently turn to them during times of crisis…
Information Visualization for Decision Making: Identifying Biases and Moving Beyond the Visual Analysis Paradigm. (La visualisation d'information pour la prise de décision: identifier les biais et aller au-delà du paradigme de l'analyse visuelle)
Research lacks metrics, methods, and empirical work to validate the effectiveness of visualizations for decision making, and a decision can be "correct" yet nonetheless irrational.

References

Showing 1-10 of 29 references
Crowdsourcing user studies with Mechanical Turk
Although micro-task markets have great potential for rapidly collecting user measurements at low costs, it is found that special care is needed in formulating tasks in order to harness the capabilities of the approach.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular…
Running Experiments on Amazon Mechanical Turk
Although Mechanical Turk has recently become popular among social scientists as a source of experimental data, doubts may linger about the quality of data provided by subjects recruited…
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
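The bias-correction idea in the entry above can be illustrated with a short sketch. This is not the paper's actual technique; it is a simplified, hypothetical scheme in which each worker is weighted by agreement with a small set of expert-labeled gold items, and per-item labels are then decided by a weighted vote. All identifiers and the data layout are invented for the example.

from collections import defaultdict

def worker_weights(labels, gold):
    # Weight each worker by accuracy on gold items (hypothetical scheme).
    # labels: {(worker, item): label}
    # gold:   {item: true_label} for a small expert-labeled subset
    correct = defaultdict(int)
    seen = defaultdict(int)
    for (worker, item), label in labels.items():
        if item in gold:
            seen[worker] += 1
            correct[worker] += int(label == gold[item])
    # Laplace smoothing: workers who saw no gold items get a neutral weight of 0.5.
    return {w: (correct[w] + 1) / (seen[w] + 2)
            for w in {worker for (worker, _item) in labels}}

def weighted_vote(labels, weights):
    # Aggregate labels per item by summing worker weights behind each candidate label.
    scores = defaultdict(lambda: defaultdict(float))
    for (worker, item), label in labels.items():
        scores[item][label] += weights[worker]
    return {item: max(cand, key=cand.get) for item, cand in scores.items()}

if __name__ == "__main__":
    labels = {
        ("w1", "a"): "pos", ("w1", "b"): "neg", ("w1", "c"): "pos",
        ("w2", "a"): "neg", ("w2", "b"): "neg", ("w2", "c"): "neg",
        ("w3", "a"): "pos", ("w3", "b"): "pos", ("w3", "c"): "pos",
    }
    gold = {"a": "pos"}  # tiny expert-labeled subset
    print(weighted_vote(labels, worker_weights(labels, gold)))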
Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk
It is found that, when combined, non-expert judgments have a high level of agreement with the existing gold-standard judgments of machine translation quality and correlate more strongly with expert judgments than BLEU does; Mechanical Turk can be used to calculate human-mediated translation edit rate (HTER), to conduct reading comprehension experiments with machine translation, and to create high quality reference translations.
Financial incentives and the "performance of crowds"
It is found that increased financial incentives increase the quantity, but not the quality, of work performed by participants; the difference appears to be due to an "anchoring" effect.
Overcoming intuition: metacognitive difficulty activates analytic reasoning.
Four experiments suggest that System 2 processes are activated by metacognitive experiences of difficulty or disfluency during reasoning; this activation reduced the impact of heuristics and defaults in judgment, reduced reliance on peripheral cues in persuasion, and improved syllogistic reasoning.
Crowdsourcing graphical perception: using mechanical turk to assess visualization design
The viability of Amazon's Mechanical Turk as a platform for graphical perception experiments is assessed; cost and performance data are reported, and recommendations for the design of crowdsourced studies are distilled.
The online laboratory: conducting experiments in a real labor market
Views on the potential role that online experiments can play within the social sciences are presented, and software development priorities and best practices are recommended.
Eliciting Self-Explanations Improves Understanding
It is shown that self-explanation can also be facilitative when it is explicitly promoted, in the context of learning declarative knowledge from an expository text, and three processing characteristics of self-explaining are considered as reasons for the gains in deeper understanding.