Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

@inproceedings{Fort2011CrowdsourcingFL,
  title={Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use},
  author={Kar{\"e}n Fort and Gilles Adda and Beno{\^i}t Sagot and J. Mariani and Alain Couillault},
  booktitle={LTC},
  year={2011}
}
This article is a position paper about Amazon Mechanical Turk, whose use in language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the field's literature, this type of online work platform makes it possible to develop all sorts of quality language resources quickly and at very low cost, relying on people who do the work as a hobby. We shall demonstrate here that the situation is far from that ideal. Our goal here is manifold: 1- to inform…
Yes, We Care! Results of the Ethics and Natural Language Processing Surveys
We present here the context and results of two surveys (a French one and an international one) concerning Ethics and NLP, which we designed and conducted between June and September 2015. These…
Crowdsourcing construction of information retrieval test collections for conversational speech
TLDR
Best practices for the design of crowdsourcing tasks to improve crowd workers’ performance are described, including the factors that influence the quality of relevance judgments on conversational speech and the differences between relevance judgments from experts and crowd workers.
Fast, Cheap, and Unethical? The Interplay of Morality and Methodology in Crowdsourced Survey Research
Crowdsourcing is an increasingly popular method for researchers in the social and behavioral sciences, including experimental philosophy, to recruit survey respondents. Crowdsourcing platforms, such…
An Application that Invites Users to Participate in Developing Repository of Bahasa Indonesia
TLDR
There is strong potential for an application that serves as a tool for building a repository in Bahasa Indonesia through the participation of its users.
The Sloleks Morphological Lexicon and its Future Development
This paper presents Sloleks, the largest open-source machine-readable morphological lexicon of the Slovene language to date. We first briefly present its development and the formal grammar behind it,…
Collecting and Evaluating Lexical Polarity with A Game With a Purpose
TLDR
LikeIt, a GWAP (Game With A Purpose), is designed to let players assign a positive, negative, or neutral value to a term, and thus to obtain a resulting polarity for most of the terms of the freely available lexical network of the JeuxDeMots project.
Creating a ground truth multilingual dataset of news and talk show transcriptions through crowdsourcing
TLDR
A detailed comparison of the results obtained with the two crowdsourcing methods tested is provided, and the main characteristics of the final ground truth resource, the methodology adopted, and the guidelines prepared for its development are described.
Mixing Crowdsourcing and Graph Propagation to Build a Sentiment Lexicon: Feelings Are Contagious
TLDR
A method to combine crowdsourcing via a Game With A Purpose (GWAP) with automated propagation of sentiments through a spreading algorithm, both using the lexical JeuxDeMots network as data source and substratum, is described.
Reexamination on Voting for Crowd Sourcing MT Evaluation
We describe a model based on Ranking Support Vector Machine (SVM) used to deal with the crowdsourcing data. Our model focuses on how to use poor quality crowdsourcing data to get high quality sorted…
Crowdsourced 'R&D' and medical research.
  • C. Callaghan
  • Computer Science, Medicine
  • British medical bulletin
  • 2015
TLDR
Crowdsourced R&D has properties well suited to large-scale medical data collection and analysis, as well as enabling rapid research responses to crises such as disease outbreaks.

References

SHOWING 1-10 OF 36 REFERENCES
Last Words: Amazon Mechanical Turk: Gold Mine or Coal Mine?
TLDR
The aim is to define precisely what MTurk is and what it is not, in the hope that this will point out opportunities for the community to deliberately value ethics above cost savings.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular…
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
TLDR
The Question-Answer Sentence Pairs (QASP) corpus is introduced, in the belief that it can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.
Using the Amazon Mechanical Turk for transcription of spoken language
TLDR
It was found that transcriptions from MTurk workers were generally quite accurate, and when transcripts for the same utterance produced by multiple workers were combined using the ROVER voting scheme, the accuracy of the combined transcript rivaled that observed for conventional transcription methods.
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
TLDR
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Paragraph Acquisition and Selection for List Question Using Amazon's Mechanical Turk
TLDR
The use of Amazon's Mechanical Turk to judge whether paragraphs in relevant documents answer the corresponding list questions in the TREC QA track 2004 is described, and a collection of 1300 gold-standard supporting paragraphs for list questions is built.
Who are the crowdworkers?: shifting demographics in mechanical turk
TLDR
How the worker population has changed over time is described, shifting from a primarily moderate-income, U.S.-based workforce towards an increasingly international group with a significant population of young, well-educated Indian workers.
Phrase Detectives: A Web-based collaborative annotation game
TLDR
The first version of Phrase Detectives is presented, to the authors' knowledge the first game designed for collaborative linguistic annotation on the Web, and the method is applied to linguistic annotation tasks such as anaphoric annotation.
Demographics of Mechanical Turk
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation…
Automatic Acquisition of a Slovak Lexicon from a Raw Corpus
This paper presents an automatic methodology we used in an experiment to acquire a morphological lexicon for the Slovak language, and the lexicon we obtained. This methodology extends and refines…