Corpus ID: 3494477

Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use

@inproceedings{Adda2011CrowdsourcingFL,
  title={Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use},
  author={Gilles Adda and Beno{\^i}t Sagot and Kar{\"e}n Fort and J. Mariani},
  booktitle={LTC 2011},
  year={2011}
}
This article is a position paper about crowdsourced microworking systems, and especially Amazon Mechanical Turk, the use of which has been growing steadily in language processing over the past few years. According to the mainstream opinion expressed in the articles of the domain, this type of online working platform makes it possible to develop all sorts of quality language resources very quickly and at very low cost, relying on people who do the work as a hobby or to earn some extra cash. We shall demonstrate here…
sloWCrowd: A crowdsourcing tool for lexicographic tasks
The paper presents sloWCrowd, a simple tool developed to facilitate crowdsourcing lexicographic tasks, such as error correction in automatically generated wordnets and semantic annotation of corpora.
Crowdsourcing for Speech: Economic, Legal and Ethical analysis
With respect to spoken language resource production, crowdsourcing - the process of distributing tasks to an open, unspecified population via the internet - offers a wide range of opportunities…
Being a turker
An ethnomethodological analysis of publicly available content on Turker Nation, a general forum for Amazon Mechanical Turk (AMT) users, provides novel depth and detail on how Turker Nation members operate as economic actors, working out which Requesters and jobs are worthwhile to them.
Can Crowdsourcing be used for Effective Annotation of Arabic?
The results obtained showed that annotating Arabic grammatical case is harder than POS tagging, and that crowdsourcing Arabic linguistic annotation requiring expert annotators may not be as effective as other crowdsourcing experiments requiring less expertise and qualifications.
Crowd-Sourcing of Human Judgments of Machine Translation Fluency
A large collection of crowd-sourced human judgments for the machine translation systems that participated in the WMT 2012 shared translation task is gathered across a range of eight different assessment configurations, to gain insight into possible causes of – and remedies for – inconsistency in human judgments.
Relationship-Based Business Process Crowdsourcing?
The complexities of the relationships between worker and organisation are revealed, and it is argued that designing some aspects of these relationships into crowdsourcing platforms and applications is as beneficial for the organisation as it is for the worker.
An extensive review of tools for manual annotation of documents
Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms; some of the reviewed tools are comprehensive and mature enough to be used on most annotation projects.
Active learning for detection of stance components
Automatic detection of five language components, all of which are relevant for expressing opinions and for stance taking, was studied; active learning was not found to be beneficial for the two sentiment categories, for which results achieved with active learning were similar to those achieved with a random selection of training data.
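The entry above contrasts active learning with a random selection of training data. As a generic illustration only, and not the cited paper's actual setup, the sketch below compares uncertainty sampling with random selection on a hypothetical pool of synthetic examples (all data and names are invented):

```python
# Generic sketch: uncertainty-sampling active learning vs. random selection.
# Hypothetical synthetic data; NOT the cited paper's experimental setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 10))                     # unlabeled pool (features)
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # oracle labels

# small seed set containing both classes, so the first fit is well defined
seed = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])

def run(select, budget=100):
    labeled = list(seed)
    clf = LogisticRegression(max_iter=1000)
    while len(labeled) < budget:
        clf.fit(X_pool[labeled], y_pool[labeled])
        candidates = [i for i in range(len(X_pool)) if i not in labeled]
        labeled.append(select(clf, candidates))
    clf.fit(X_pool[labeled], y_pool[labeled])
    return clf.score(X_pool, y_pool)  # accuracy over the whole pool, for illustration

def uncertainty(clf, candidates):
    # query the example whose predicted probability is closest to 0.5
    probs = clf.predict_proba(X_pool[candidates])[:, 1]
    return candidates[int(np.argmin(np.abs(probs - 0.5)))]

def random_pick(clf, candidates):
    return int(rng.choice(candidates))

print("active:", run(uncertainty), "random:", run(random_pick))
```

On easy, well-separated data like this the two selection strategies can score similarly, which mirrors the finding above that active learning does not always beat random selection.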
Automated agents for reward determination for human work in crowdsourcing applications
This work considers the problem of designing automated agents for reward determination and negotiation in crowdsourcing applications, and presents two such agents, based on two different models of human behavior, which outperform strategies developed by human experts.
PAL, a tool for Pre-annotation and Active Learning
The aim of PAL, a tool for pre-annotation and active learning, is to provide a ready-made package that can be used to simplify annotation and to reduce the amount of annotated data required to train a machine learning classifier.

References

Showing 1-10 of 43 references
Last Words: Amazon Mechanical Turk: Gold Mine or Coal Mine?
The aim is to define precisely what MTurk is and what it is not, in the hope that this will point out opportunities for the community to deliberately value ethics above cost savings.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular…
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
The Question-Answer Sentence Pairs (QASP) corpus is introduced; it is believed that this corpus can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.
Who are the crowdworkers?: shifting demographics in mechanical turk
How the worker population has changed over time is described: it has shifted from a primarily moderate-income, U.S.-based workforce towards an increasingly international group with a significant population of young, well-educated Indian workers.
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
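The bias correction mentioned above can be pictured as recalibrating each non-expert annotator against a small amount of gold-standard data before aggregating their labels. The sketch below shows one simple variant, weighting each worker's vote by their smoothed accuracy on gold items; it is an illustrative assumption, not necessarily the paper's exact calibration method, and all annotations and labels are invented:

```python
# Minimal sketch of annotator-bias correction when aggregating crowd labels:
# weight each worker's vote by accuracy on a small gold-standard set.
# Illustrative only; the data below is made up.
from collections import defaultdict

# annotations[item] = list of (worker, label); gold = known labels for a few items
annotations = {
    "s1": [("w1", "pos"), ("w2", "neg"), ("w3", "pos")],
    "s2": [("w1", "neg"), ("w2", "neg"), ("w3", "pos")],
    "s3": [("w1", "pos"), ("w2", "pos"), ("w3", "neg")],
}
gold = {"s1": "pos", "s2": "neg"}

def worker_accuracy(annotations, gold, smoothing=1.0):
    correct, total = defaultdict(float), defaultdict(float)
    for item, labels in annotations.items():
        if item not in gold:
            continue
        for worker, label in labels:
            total[worker] += 1.0
            correct[worker] += float(label == gold[item])
    # Laplace-smoothed accuracy so sparsely observed workers get a moderate weight
    return {w: (correct[w] + smoothing) / (total[w] + 2 * smoothing) for w in total}

def weighted_vote(labels, acc, default=0.5):
    scores = defaultdict(float)
    for worker, label in labels:
        scores[label] += acc.get(worker, default)  # unseen workers get a neutral weight
    return max(scores, key=scores.get)

acc = worker_accuracy(annotations, gold)
for item, labels in annotations.items():
    print(item, weighted_vote(labels, acc))
```

Compared with a plain majority vote, this down-weights workers who disagree with the gold standard, which is the intuition behind correcting non-expert bias.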
Using the Amazon Mechanical Turk for transcription of spoken language
It was found that transcriptions from MTurk workers were generally quite accurate, and when transcripts for the same utterance produced by multiple workers were combined using the ROVER voting scheme, the accuracy of the combined transcript rivaled that observed for conventional transcription methods.
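ROVER combines multiple transcripts by aligning them into a word transition network and voting at each slot. The sketch below keeps only the voting step and assumes the transcripts are already aligned word-for-word, which real ROVER achieves with dynamic-programming alignment; the example sentences are invented:

```python
# Simplified word-level voting over multiple crowd transcripts of one utterance.
# Real ROVER first aligns the transcripts into a word transition network; this
# sketch assumes equal-length, word-aligned inputs and only majority-votes.
from collections import Counter

def combine_transcripts(transcripts):
    """Majority vote per word position across aligned, equal-length transcripts."""
    assert len({len(t) for t in transcripts}) == 1, "sketch assumes aligned inputs"
    combined = []
    for words_at_position in zip(*transcripts):
        most_common_word, _ = Counter(words_at_position).most_common(1)[0]
        combined.append(most_common_word)
    return " ".join(combined)

workers = [
    "the quick brown fox jumps".split(),
    "the quick brown box jumps".split(),
    "a quick brown fox jumps".split(),
]
print(combine_transcripts(workers))  # -> "the quick brown fox jumps"
```

With three or more workers, isolated transcription errors tend to be outvoted, which is why the combined transcript can approach conventional transcription quality.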
Paragraph Acquisition and Selection for List Question Using Amazon's Mechanical Turk
Using Amazon's Mechanical Turk to judge whether paragraphs in relevant documents answer corresponding list questions in the TREC 2004 QA track is described, and a collection of 1,300 gold-standard supporting paragraphs for list questions is built.
Task search in a human computation market
The main findings are that, at a large scale, workers sort by which tasks were most recently posted and which have the largest number of tasks available, and that at least some employers try to manipulate the position of their tasks in the search results to exploit workers' tendency to search for recently posted tasks.
Demographics of Mechanical Turk
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation…
Phrase Detectives: A Web-based collaborative annotation game
The first version of Phrase Detectives is presented, to the authors' knowledge the first game designed for collaborative linguistic annotation on the Web, applied to linguistic annotation tasks such as anaphoric annotation.