A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs

@inproceedings{Toutanova2016ADA,
  title={A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs},
  author={Kristina Toutanova and Chris Brockett and Ke M. Tran and Saleema Amershi},
  booktitle={EMNLP},
  year={2016}
}
We introduce a manually-created, multi-reference dataset for abstractive sentence and short paragraph compression. First, we examine the impact of single- and multi-sentence level editing operations on human compression quality as found in this corpus. We observe that substitution and rephrasing operations are more meaning preserving than other operations, and that compressing in context improves quality. Second, we systematically explore the correlations between automatic evaluation metrics and…