Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics

@inproceedings{Saggion2002MetaevaluationOS,
  title={Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics},
  author={Horacio Saggion and Dragomir R. Radev and Simone Teufel and Wai Lam},
  booktitle={COLING},
  year={2002}
}
We describe a framework for the evaluation of summaries in English and Chinese using similarity measures. The framework can be used to evaluate extractive, non-extractive, single-document and multi-document summarization. We also describe the resources developed within this framework, which are made available to the research community.

Analysis of Automated Evaluation for Multi-document Summarization Using Content-Based Similarity
Li-qing Qiu, Bin Pang. Second International Conference on the Digital Society, 2008.
TLDR
An automated evaluation method based on content similarity is introduced, and a vector space of words is constructed, on which the cosine similarity of automated summaries and human summaries is computed.
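The vector-space scoring described above can be sketched as follows; the whitespace tokenizer and the example summaries are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    # Dot product over the shared vocabulary.
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Score a system summary against a human reference summary (higher = closer).
system = "the framework evaluates summaries using similarity measures"
reference = "summaries are evaluated with content based similarity measures"
score = cosine_similarity(system, reference)
```

In practice such systems typically weight terms (e.g. tf-idf) and remove stopwords before computing the cosine; the raw-count version above is the minimal form.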
Multilingual Multidocument Summarization Tools and Evaluation
TLDR
A centroid-based sentence extraction system has been developed which decides the content of the summary using texts in different languages and uses sentences from English sources alone to create the final output.
Multilingual Summarization Evaluation without Human Models
TLDR
This work applies a new content-based evaluation framework called Fresa to compute a variety of divergences among probability distributions in text summarization tasks including generic and focus-based multi-document summarization in English and generic single-document summary in French and Spanish.
Summary Evaluation with and without References
TLDR
A new content-based method for evaluating text summarization systems without human models is studied; it computes a variety of divergences among probability distributions and uses them to produce system rankings.
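A divergence-based score of this kind can be sketched with the Jensen-Shannon divergence between the word distributions of two texts; the tokenization and example texts below are illustrative assumptions.

```python
from collections import Counter
from math import log2

def js_divergence(text_a: str, text_b: str) -> float:
    """Jensen-Shannon divergence between word distributions.

    With log base 2 the result lies in [0, 1]: 0 for identical
    distributions, 1 for texts sharing no vocabulary.
    """
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    vocab = set(ca) | set(cb)
    ta, tb = sum(ca.values()), sum(cb.values())
    p = {w: ca[w] / ta for w in vocab}
    q = {w: cb[w] / tb for w in vocab}
    m = {w: 0.5 * (p[w] + q[w]) for w in vocab}  # mixture distribution

    def kl(d):  # Kullback-Leibler divergence from d to the mixture m
        return sum(d[w] * log2(d[w] / m[w]) for w in vocab if d[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)
```

A summary is then ranked by how low its divergence is from the source document or from other summaries, so no human-written reference is required.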
Evaluating the Efficacy of Summarization Evaluation across Languages
TLDR
This work takes a summarization corpus for eight different languages, and manually annotates generated summaries for focus (precision) and coverage (recall), and finds that using multilingual BERT within BERTScore performs well across all languages.
An Extensive Empirical Study of Automated Evaluation of Multi-Document Summarization
TLDR
An approach to automated evaluation of multi-document summarization is discussed, in which automated summaries are scored by their similarity to human-written summaries.
Robust Generic and Query-based Summarization
TLDR
A robust summarisation system developed within the GATE architecture is presented; it makes use of GATE's components for semantic tagging and coreference resolution and combines them with well-established statistical techniques from text summarisation research.
Methods for Automatic Evaluation of Sentence Extract Summaries
TLDR
Three novel techniques to automatically evaluate sentence extract summaries using a fuzzy set theoretic basis and WordNet based hypernymy structures to detect similarity between sentences at abstracted levels are described.
Evaluating N-gram based Evaluation Metrics for Automatic Keyphrase Extraction
TLDR
This paper describes a feasibility study of n-gram-based evaluation metrics for automatic keyphrase extraction; it adapts various evaluation metrics developed for machine translation and summarization, as well as the R-precision metric from keyphrase evaluation.
Multi-Document Biography Summarization
TLDR
A biography summarization system using sentence classification and ideas from information retrieval to generate multi-document biographies is described; it was among the top performers in task 5 (short summaries focused by person questions).

References

SHOWING 1-10 OF 17 REFERENCES
Developing Infrastructure for the Evaluation of Single and Multi-document Summarization Systems in a Cross-lingual Environment
TLDR
This work describes the development of Language and Evaluation Resources for the evaluation of summaries in English and Chinese and focuses on the resources developed that are made available for the research community.
Centroid-based summarization of multiple documents: sentence extraction utility-based evaluation, and user studies
TLDR
A multi-document summarizer, called MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and two new techniques, based on sentence utility and subsumption, are described.
Concept Identification and Presentation in the Context of Technical Text Summarization
TLDR
A method of text summarization that produces indicative-informative abstracts for technical papers that indicates good performance in both tasks when compared with other summarization technologies.
Cut and Paste Based Text Summarization
TLDR
This work includes a statistically based sentence decomposition program that identifies where the phrases of a summary originate in the original document, producing an aligned corpus of summaries and articles which is used to develop the summarizer.
A Comparison of Rankings Produced by Summarization Evaluation Measures
TLDR
This paper proposes using sentence-rank-based and content-based measures for evaluating extract summaries, and compares these with recall-based evaluation measures.
Automated Text Summarization in SUMMARIST
TLDR
The system’s architecture is described and details of some of its modules, many of them trained on large corpora of text, are provided.
The TIPSTER SUMMAC Text Summarization Evaluation
The TIPSTER Text Summarization Evaluation (SUMMAC) has established definitively that automatic text summarization is very effective in relevance assessment tasks. Summaries as short as 17% of full
Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts
TLDR
Methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content in arbitrary subject areas in accordance with user needs.
Tagging Sentence Boundaries
TLDR
This paper describes an extension of traditional POS tagging that combines it with a document-centered approach to proper-name identification and abbreviation handling, making the resulting system robust to domain and topic shifts.
Summarizing Similarities and Differences Among Related Documents
TLDR
The approach described here exploits recent progress in information extraction to represent salient units of text and their relationships, capturing meaningful relations between units based on an analysis of text cohesion and the context in which the comparison is desired.