
Looking for a Few Good Metrics: Automatic Summarization Evaluation - How Many Samples Are Enough?

@inproceedings{Lin2004LookingFA,
  title={Looking for a Few Good Metrics: Automatic Summarization Evaluation - How Many Samples Are Enough?},
  author={Chin-Yew Lin},
  booktitle={NTCIR},
  year={2004}
}
ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures that automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units, such as n-grams, word sequences, and word pairs, between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper discusses the validity of the evaluation method used in the Document Understanding Conference (DUC) and how many samples are needed for reliable evaluation.
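
The n-gram counting described in the abstract can be illustrated with a small sketch. The following Python snippet is an illustrative implementation of ROUGE-N recall only, assuming simple whitespace tokenization and a single reference summary; the function names are hypothetical, and the actual ROUGE package additionally supports multiple references, stemming, stopword removal, and further measures such as ROUGE-L and ROUGE-S.

# Illustrative sketch of ROUGE-N recall: clipped n-gram overlap between a
# candidate summary and one human reference, divided by the number of
# n-grams in the reference. Names and tokenization are simplifying assumptions.
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter over the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=2):
    """Compute ROUGE-N recall for a single candidate/reference pair."""
    cand_counts = ngrams(candidate.lower().split(), n)
    ref_counts = ngrams(reference.lower().split(), n)
    # Clip each overlapping n-gram count by its count in the candidate.
    overlap = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat was on the mat"
    print(rouge_n_recall(candidate, reference, n=2))  # bigram recall = 0.6
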
Citations

Evaluation of Automatic Summaries: Metrics under Varying Data Conditions
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization
Automatically Assessing Machine Summary Content Without a Gold Standard
Automatic evaluation of spoken summaries: the case of language assessment

References

ROUGE: A Package for Automatic Evaluation of Summaries
Manual and automatic evaluation of summaries
Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics
Evaluation method for automatic speech summarization
Evaluating Content Selection in Summarization: The Pyramid Method
Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics