Corpus ID: 3494477

Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use

@inproceedings{adda2011crowdsourcing,
  title={Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use},
  author={Gilles Adda and Beno{\^i}t Sagot and Kar{\"e}n Fort and J. Mariani},
  booktitle={LTC 2011},
}
This article is a position paper about crowdsourced microworking systems, especially Amazon Mechanical Turk, whose use in language processing has grown steadily in the past few years. According to the mainstream opinion expressed in articles in the domain, this type of online working platform makes it possible to develop all sorts of quality language resources very quickly, for a very low price, produced by people doing it as a hobby or wanting some extra cash. We shall demonstrate here…
sloWCrowd: A crowdsourcing tool for lexicographic tasks
The paper presents sloWCrowd, a simple tool developed to facilitate crowdsourcing lexicographic tasks, such as error correction in automatically generated wordnets and semantic annotation of corpora.
Crowdsourcing for Speech: Economic, Legal and Ethical analysis
This article will focus on ethical, legal and economic issues of crowdsourcing in general and of crowdsourced services such as Amazon Mechanical Turk, a major platform for multilingual language resources (LR) production.
Being a turker
An ethnomethodological analysis of publicly available content on Turker Nation, a general forum for Amazon Mechanical Turk (AMT) users, provides novel depth and detail on how Turker Nation members operate as economic actors, working out which Requesters and jobs are worthwhile to them.
Can Crowdsourcing be used for Effective Annotation of Arabic?
The results obtained showed that annotating Arabic grammatical case is harder than POS tagging, and that crowdsourcing Arabic linguistic annotation requiring expert annotators may not be as effective as other crowdsourcing experiments requiring less expertise and fewer qualifications.
Crowd-Sourcing of Human Judgments of Machine Translation Fluency
A large collection of crowd-sourced human judgments for the machine translation systems that participated in the WMT 2012 shared translation task is gathered across a range of eight different assessment configurations to gain insight into possible causes of – and remedies for – inconsistency in human judgments.
Relationship-Based Business Process Crowdsourcing?
The complexities of the relationships between worker and organisation are revealed and it is argued that designing some aspects of these relationships into crowdsourcing platforms and applications is as beneficial for the organisation as it is for the worker.
A Comparative Study on Collecting High-Quality Implicit Reasonings at a Large-scale
This paper tackles the complex task of warrant explication, devises various methodologies for collecting warrants, and finds that these methodologies allow high-quality warrants to be collected.
Exploring Methodologies for Collecting High-Quality Implicit Reasoning in Arguments
This work shows how semi-structured warrants can be annotated at large scale via crowdsourcing, and demonstrates through extensive quality evaluation that its methodologies enable collecting better-quality warrants than unstructured annotation.
An extensive review of tools for manual annotation of documents
Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms; some tools are comprehensive and mature enough to be used on most annotation projects.
Annotating Implicit Reasoning in Arguments with Causal Links
This work proposes a semi-structured template that explicates the implicit reasoning in arguments via causality, creates a novel two-phase annotation process with simplified guidelines, and shows how to collect and filter high-quality implicit reasonings via crowdsourcing.


Last Words: Amazon Mechanical Turk: Gold Mine or Coal Mine?
By defining precisely what MTurk is and what it is not, it is hoped that the article will point out opportunities for the community to deliberately value ethics above cost savings.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular…
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
The Question-Answer Sentence Pairs (QASP) corpus is introduced and it is believed that this corpus can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.
Who are the crowdworkers?: shifting demographics in mechanical turk
This work describes how the worker population has changed over time, shifting from a primarily moderate-income, U.S.-based workforce towards an increasingly international group with a significant population of young, well-educated Indian workers.
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Using the Amazon Mechanical Turk for transcription of spoken language
It was found that transcriptions from MTurk workers were generally quite accurate, and when transcripts for the same utterance produced by multiple workers were combined using the ROVER voting scheme, the accuracy of the combined transcript rivaled that observed for conventional transcription methods.
Paragraph Acquisition and Selection for List Question Using Amazon's Mechanical Turk
Using Amazon's Mechanical Turk to judge whether paragraphs in relevant documents answer corresponding list questions in the TREC 2004 QA track is described, and a collection of 1300 gold-standard supporting paragraphs for list questions is built.
Task search in a human computation market
The main findings are that on a large scale, workers sort by which tasks are most recently posted and which have the largest number of tasks available and that at least some employers try to manipulate the position of their task in the search results to exploit the tendency to search for recently posted tasks.
Demographics of Mechanical Turk
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation…
Phrase Detectives: A Web-based collaborative annotation game
The first version of Phrase Detectives, to the authors' knowledge the first game designed for collaborative linguistic annotation on the Web, is presented, along with its application to linguistic annotation tasks such as anaphoric annotation.