Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use
@inproceedings{Fort2011CrowdsourcingFL,
  title={Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use},
  author={Kar{\"e}n Fort and Gilles Adda and Beno{\^i}t Sagot and J. Mariani and Alain Couillault},
  booktitle={LTC},
  year={2011}
}
This article is a position paper about Amazon Mechanical Turk, the use of which has been steadily growing in language processing in the past few years. According to the mainstream opinion expressed in articles in the field, this type of online working platform makes it possible to quickly develop all sorts of quality language resources, at a very low price, using people who do the work as a hobby. We shall demonstrate here that the situation is far from being that ideal. Our goal here is manifold: 1- to inform…
20 Citations
Yes, We Care! Results of the Ethics and Natural Language Processing Surveys
- Philosophy
- LREC
- 2016
We present here the context and results of two surveys (a French one and an international one) concerning Ethics and NLP, which we designed and conducted between June and September 2015. These…
Large-scale deep linguistic processing IN COLLABORATION WITH: Analyse Linguistique Profonde A Grande Echelle (ALPAGE)
- Linguistics
- 2013
The general aim of PARSEME is increasing and enhancing the ICT support of the European multilingual heritage. This aim is pursued via more detailed objectives: (1) to put multilingualism in focus of…
Crowdsourcing construction of information retrieval test collections for conversational speech
- Computer Science
- 2015
Best practices for the design of crowdsourcing tasks to improve crowd workers’ performance are described, including which factors influence the quality of relevance judgments on conversational speech and how relevance judgments from experts and crowd workers differ.
Fast, Cheap, and Unethical? The Interplay of Morality and Methodology in Crowdsourced Survey Research
- Business
- 2018
Crowdsourcing is an increasingly popular method for researchers in the social and behavioral sciences, including experimental philosophy, to recruit survey respondents. Crowdsourcing platforms, such…
An Application that Invites Users to Participate in Developing Repository of Bahasa Indonesia
- Computer Science
- 2018 International Conference on Computer, Control, Informatics and its Applications (IC3INA)
- 2018
There is strong potential for an application that serves as a tool to build a repository of Bahasa Indonesia through the participation of its users.
The Sloleks Morphological Lexicon and its Future Development
- Computer Science
- 2017
This paper presents Sloleks, the largest open-source machine-readable morphological lexicon of the Slovene language to date. We first briefly present its development and the formal grammar behind it,…
Collecting and Evaluating Lexical Polarity with A Game With a Purpose
- Computer Science
- RANLP
- 2015
LikeIt, a GWAP (Game With A Purpose), is designed; it lets players attribute a positive, negative or neutral value to a term, and thus yields a resulting polarity for most of the terms of the freely available lexical network of the JeuxDeMots project.
Creating a ground truth multilingual dataset of news and talk show transcriptions through crowdsourcing
- Computer Science
- Lang. Resour. Evaluation
- 2017
A detailed comparison of the results obtained with the two crowdsourcing methods tested is provided, and the main characteristics of the final ground truth resource, the methodology adopted, and the guidelines prepared for its development are described.
Mixing Crowdsourcing and Graph Propagation to Build a Sentiment Lexicon: Feelings Are Contagious
- Computer Science
- NLDB
- 2016
A method is described that combines crowdsourcing via a Game With A Purpose (GWAP) with automated propagation of sentiments through a spreading algorithm, both using the lexical JeuxDeMots network as data source and substratum.
Dictionary of Modern Slovene: Problems and Solutions
- Computer Science
- 2018
This paper views the dictionary as a multi-tier architecture with a presentation tier, a middle application tier (a back-end application system with a component for semi-automatic data extraction), and a data tier, and presents the structure and some of the technological considerations, which guarantee good extensibility, reliability, and adaptability of the final solution.
References
Last Words: Amazon Mechanical Turk: Gold Mine or Coal Mine?
- Geology
- CL
- 2011
The article aims to define precisely what MTurk is and what it is not, in the hope that this will point out opportunities for the community to deliberately value ethics above cost savings.
Who are the Turkers? Worker Demographics in Amazon Mechanical Turk
- Economics
- 2009
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is becoming increasingly popular…
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
- Computer Science
- LREC
- 2008
The Question-Answer Sentence Pairs (QASP) corpus is introduced and it is believed that this corpus can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.
Using the Amazon Mechanical Turk for transcription of spoken language
- Biology
- 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2010
It was found that transcriptions from MTurk workers were generally quite accurate, and when transcripts for the same utterance produced by multiple workers were combined using the ROVER voting scheme, the accuracy of the combined transcript rivaled that observed for conventional transcription methods.
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
- Computer Science
- EMNLP
- 2008
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Paragraph Acquisition and Selection for List Question Using Amazon's Mechanical Turk
- Computer Science
- LREC
- 2010
The use of Amazon's Mechanical Turk to judge whether paragraphs in relevant documents answer the corresponding list questions in the TREC QA track 2004 is described, and a collection of 1300 gold-standard supporting paragraphs for list questions is built.
Who are the crowdworkers?: shifting demographics in mechanical turk
- Economics
- CHI Extended Abstracts
- 2010
How the worker population has changed over time is described, shifting from a primarily moderate-income, U.S. based workforce towards an increasingly international group with a significant population of young, well-educated Indian workers.
Phrase Detectives: A Web-based collaborative annotation game
- Computer Science
- 2008
The first version of Phrase Detectives is presented; to the authors' knowledge, it is the first game designed for collaborative linguistic annotation on the Web, and it applies this method to linguistic annotation tasks such as anaphoric annotation.
Demographics of Mechanical Turk
- Economics
- 2010
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation…
Automatic Acquisition of a Slovak Lexicon from a Raw Corpus
- Linguistics
- TSD
- 2005
This paper presents an automatic methodology we used in an experiment to acquire a morphological lexicon for the Slovak language, and the lexicon we obtained. This methodology extends and refines…