Finding a Balanced Degree of Automation for Summary Evaluation

Shiyue Zhang and Mohit Bansal
Human evaluation for summarization tasks is reliable but costly and hard to reproduce. Automatic metrics are cheap and reproducible but sometimes correlate poorly with human judgment. In this work, we propose flexible semi-automatic to automatic summary evaluation metrics, following the Pyramid human evaluation method. Semi-automatic Lite2Pyramid retains the reusable human-labeled Summary Content Units (SCUs) for reference(s) but replaces the manual work of judging SCUs…
