Analyzing the capabilities of crowdsourcing services for text summarization

@article{Lloret2013AnalyzingTC,
  title={Analyzing the capabilities of crowdsourcing services for text summarization},
  author={Elena Lloret and Laura Plaza and Ahmet Aker},
  journal={Language Resources and Evaluation},
  year={2013},
  volume={47},
  pages={337-369}
}
This paper presents a detailed analysis of the use of crowdsourcing services for the Text Summarization task in the context of the tourist domain. In particular, our aim is to retrieve relevant information about a place or an object pictured in an image in order to provide a short summary that will be of great help to a tourist. To tackle this task, we propose a broad set of experiments using crowdsourcing services that could be useful as a reference for others who want to rely also on…

Citations
A Crowdsourcing Approach to Evaluate the Quality of Query-based Extractive Text Summaries
This work analyzes the feasibility and appropriateness of micro-task crowdsourcing for evaluating different summary quality characteristics and reports ongoing work on the crowdsourced evaluation of query-based extractive text summaries.
Deployment strategies for crowdsourcing text creation
This work formalizes crowdsourcing deployment strategies along three dimensions: work structure, workforce organization, and work style. It implements these strategies for translation, summarization, and narrative writing tasks with a semi-automatic tool built on the Amazon Mechanical Turk API.
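As a rough illustration of how such micro-tasks are posted programmatically (not the authors' tool; every parameter below is an assumed example), a summarization HIT could be created via the Mechanical Turk API with boto3:

```python
# Illustrative sketch: posting a summarization micro-task via the MTurk API
# using boto3. All task parameters here are assumptions, not the paper's setup.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint so the sketch can be tried without spending money.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# A minimal HTMLQuestion: show instructions and collect a free-text summary.
question_xml = """
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <form name="mturk_form" method="post" action="https://workersandbox.mturk.com/mturk/externalSubmit">
        <!-- Real tasks fill assignmentId from the URL parameters. -->
        <input type="hidden" name="assignmentId" value="">
        <p>Read the text below and write a 2-3 sentence summary.</p>
        <textarea name="summary" rows="5" cols="80"></textarea>
        <input type="submit" value="Submit">
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""

response = mturk.create_hit(
    Title="Summarize a short tourist text",
    Description="Write a 2-3 sentence summary of the given text.",
    Keywords="summarization, text, writing",
    Reward="0.10",                   # USD per assignment (assumed value)
    MaxAssignments=3,                # redundant workers per item
    LifetimeInSeconds=3 * 24 * 3600,
    AssignmentDurationInSeconds=15 * 60,
    Question=question_xml,
)
print("HIT created:", response["HIT"]["HITId"])
```
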
Crowdsourcing versus the laboratory: Towards crowd-based linguistic text quality assessment of query-based extractive summarization
Microtask crowdsourcing is found to be highly applicable for assessing overall quality, grammaticality, non-redundancy, referential clarity, focus, and structure & coherence in query-based extractive summarization of online forum discussions.
How Do Order and Proximity Impact the Readability of Event Summaries?
This paper conducts an empirical study on a crowdsourcing platform to gain insights into the regularities that make a text summary coherent and readable, and releases its data to facilitate future work such as designing dedicated measures for evaluating summary structure.
Tweet-biased summarization
The results show that incorporating social information into the summary generation process can improve summary accuracy, and that user preference for the tweet-biased summaries (TBS) was significantly higher than for the generic summaries (GS).
Best Practices for Crowd-based Evaluation of German Summarization: Comparing Crowd, Expert and Automatic Evaluation
This work proposes crowdsourcing as a fast, scalable, and cost-effective alternative to expert evaluation for assessing the intrinsic and extrinsic quality of summaries, comparing crowd ratings with expert ratings and automatic metrics such as ROUGE, BLEU, and BERTScore on a German summarization dataset.
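Comparisons of this kind typically report rank correlations between crowd ratings and expert or metric scores. A minimal sketch of that computation, using invented placeholder scores rather than the paper's data:

```python
# Sketch: rank correlation between crowd ratings and expert/metric scores.
# The scores below are invented placeholders, not data from the paper.
from scipy.stats import spearmanr

crowd_ratings  = [4.2, 3.1, 4.8, 2.5, 3.9]       # mean crowd score per summary
expert_ratings = [4.0, 3.3, 4.5, 2.2, 4.1]       # expert score per summary
rouge_1_f      = [0.42, 0.31, 0.47, 0.20, 0.38]  # automatic metric per summary

rho_expert, p_expert = spearmanr(crowd_ratings, expert_ratings)
rho_rouge,  p_rouge  = spearmanr(crowd_ratings, rouge_1_f)

print(f"crowd vs. expert:  rho={rho_expert:.2f} (p={p_expert:.3f})")
print(f"crowd vs. ROUGE-1: rho={rho_rouge:.2f} (p={p_rouge:.3f})")
```
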
Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps
A newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics is presented; it was built with a novel crowdsourcing approach that efficiently determines the important elements in large document collections.
The challenging task of summary evaluation: an overview
A clear, up-to-date overview of the evolution and progress of summarization evaluation is provided, giving the reader useful insights into the past, present, and latest trends in the automatic evaluation of summaries.
Quality Features for Summarizing Text Forum Threads by Selecting Quality Replies
This paper aims to select quality replies to the topic raised in the initial post, which together provide a short summary of the discussion, and shows that the proposed approach, based on forum quality features and crowdsourcing, can improve the performance of forum-thread summarization.
Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
A new approach for creating hierarchical summarization corpora is presented: first, relevant content is extracted from large, heterogeneous document collections using crowdsourcing; second, the relevant information is ordered hierarchically by trained annotators.
…

References

Showing 1-10 of 32 references
Assessing Crowdsourcing Quality through Objective Tasks
One interesting finding is that the results do not confirm previous studies concluding that an increase in payment attracts more noise; the country of origin has an impact only in some of the categories and only for general text questions, with no significant difference at the top pay level.
Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution
The use of crowdsourcing platforms like Amazon Mechanical Turk for evaluating the relevance of search results has become an effective strategy that yields results quickly and inexpensively. One…
Model Summaries for Location-related Images
A corpus of human-generated model captions in English and German is described, and a high correlation in ROUGE scores between post-edited and non-post-edited model summaries is found, indicating that the expensive process of post-editing is not necessary.
Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria
An empirical study is conducted to examine the effect of noisy annotations on the performance of sentiment classification models and to evaluate the utility of annotation selection for classification accuracy and efficiency.
Summarization system evaluation revisited: N-gram graphs
A novel automatic method for evaluating summarization systems is proposed, based on comparing character n-gram graph representations of the extracted summaries with those of a set of model summaries; its evaluation performance appears to match and even exceed that of other contemporary evaluation methods.
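The n-gram graph representation links character n-grams that co-occur within a small window, and summaries are scored by graph similarity to the model summaries. The sketch below is a heavily simplified variant (unweighted edge overlap rather than the method's weighted value similarity); the texts and parameters are invented:

```python
# Simplified sketch of character n-gram graph comparison.
# Real n-gram graph methods use weighted edges and richer similarity
# measures; this version only counts shared co-occurrence edges.

def char_ngram_graph(text, n=3, window=3):
    """Return the set of undirected edges between character n-grams
    that co-occur within `window` positions of each other."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    edges = set()
    for i, g in enumerate(grams):
        for h in grams[i + 1:i + 1 + window]:
            edges.add(frozenset((g, h)))
    return edges

def graph_similarity(a, b):
    """Jaccard overlap of the two edge sets (0 = disjoint, 1 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

peer  = char_ngram_graph("the castle was built in the 12th century")
model = char_ngram_graph("the castle dates from the 12th century")
print(f"similarity: {graph_similarity(peer, model):.2f}")
```
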
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
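Bias correction of this sort generally weights each worker's votes by an accuracy estimate obtained from a small gold-labeled subset; the paper's actual recalibration is more elaborate. A minimal sketch of gold-calibrated weighted voting for binary labels, with all workers, items, and labels invented:

```python
# Sketch: gold-calibrated weighted voting for binary crowd labels.
# Worker accuracies are estimated on a few gold items, then used as
# log-odds weights when aggregating labels on the remaining items.
import math

# worker -> {item: label}, labels are 0/1 (invented example data)
votes = {
    "w1": {"i1": 1, "i2": 1, "i3": 0, "i4": 1},
    "w2": {"i1": 0, "i2": 1, "i3": 0, "i4": 0},
    "w3": {"i1": 1, "i2": 0, "i3": 0, "i4": 1},
}
gold = {"i1": 1, "i2": 1}  # small expert-labeled subset

def worker_accuracy(worker_votes, gold, smoothing=1.0):
    """Laplace-smoothed accuracy of one worker on the gold items."""
    correct = sum(worker_votes.get(i) == y for i, y in gold.items())
    total = sum(i in worker_votes for i in gold)
    return (correct + smoothing) / (total + 2 * smoothing)

accuracy = {w: worker_accuracy(v, gold) for w, v in votes.items()}

def aggregate(item):
    """Log-odds weighted vote: accurate workers count for more."""
    score = 0.0
    for w, v in votes.items():
        if item in v:
            weight = math.log(accuracy[w] / (1 - accuracy[w]))
            score += weight if v[item] == 1 else -weight
    return 1 if score > 0 else 0

for item in ["i3", "i4"]:
    print(item, "->", aggregate(item))
```
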
Creating a Bi-lingual Entailment Corpus through Translations with Mechanical Turk: $100 for a 10-day Rush
This paper reports on experiments in creating a bilingual Textual Entailment corpus using a non-expert workforce under strict cost and time limitations, and summarizes the methodology adopted, the results achieved, the main problems encountered, and the lessons learned.
Crowdsourcing for relevance evaluation
A new approach to evaluation called TERC is described, based on the crowdsourcing paradigm, in which many online users drawn from a large community each perform a small evaluation task.
Crowdsourcing user studies with Mechanical Turk
Although micro-task markets have great potential for rapidly collecting user measurements at low cost, it is found that special care is needed in formulating tasks in order to harness the capabilities of the approach.
ROUGE: A Package for Automatic Evaluation of Summaries
Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, which are included in the ROUGE summarization evaluation package, together with their evaluations.
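For reference, ROUGE-N is essentially n-gram recall of a candidate summary against reference summaries. A minimal from-scratch sketch (not the official ROUGE package; no stemming, stopword handling, or multiple references):

```python
# Minimal ROUGE-N recall sketch: fraction of reference n-grams that also
# appear in the candidate summary. The official ROUGE package adds
# stemming, stopword options, multiple references, and F-scores.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=2):
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[g]) for g, count in ref.items())
    return overlap / sum(ref.values())

reference = "the cathedral was built in the twelfth century"
candidate = "the cathedral dates from the twelfth century"
print(f"ROUGE-2 recall: {rouge_n_recall(candidate, reference, n=2):.2f}")
```
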