Answers Unite! Unsupervised Metrics for Reinforced Summarization Models

@article{Scialom2019AnswersUU,
  title={Answers Unite! Unsupervised Metrics for Reinforced Summarization Models},
  author={Thomas Scialom and Sylvain Lamprier and Benjamin Piwowarski and Jacopo Staiano},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.01610}
}
Abstractive summarization approaches based on Reinforcement Learning (RL) have recently been proposed to overcome classical likelihood maximization. RL makes it possible to consider complex, possibly non-differentiable, metrics that globally assess the quality and relevance of the generated outputs. ROUGE, the most widely used summarization metric, is known to suffer from a bias towards lexical similarity as well as from sub-optimal accounting for the fluency and readability of the generated abstracts. We thus…
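To make the RL setup described in the abstract concrete, here is a minimal, hypothetical sketch (in PyTorch) of self-critical policy-gradient training with an arbitrary, possibly non-differentiable reward. The function name and the toy reward values are illustrative assumptions, not the authors' implementation; any scorer, such as a QA-based metric, could supply the rewards.

import torch

def self_critical_loss(log_probs, sampled_reward, greedy_reward):
    # REINFORCE with a greedy-decoding baseline: the advantage
    # (sampled reward minus greedy reward) scales the negative
    # log-likelihood of the sampled summary. The reward itself can be
    # any metric and need not be differentiable, since gradients flow
    # only through log_probs.
    advantage = (sampled_reward - greedy_reward).detach()
    return -(advantage * log_probs).mean()

# Toy usage: rewards come from an external, non-differentiable scorer.
log_probs = torch.tensor([-12.3, -9.8], requires_grad=True)  # per-summary log p
loss = self_critical_loss(
    log_probs,
    sampled_reward=torch.tensor([0.42, 0.55]),
    greedy_reward=torch.tensor([0.40, 0.60]),
)
loss.backward()  # gradients reach only the summarizer's log-probabilities

Citations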
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
TLDR
This work proposes SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo-reference summary, i.e., selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques.
QuestEval: Summarization Asks for Fact-based Evaluation
TLDR
QuestEval substantially improves the correlation with human judgments over four evaluation dimensions (consistency, coherence, fluency, and relevance), as shown in the extensive experiments the authors report.
The Summary Loop: Learning to Write Abstractive Summaries Without Examples
TLDR
This work introduces a novel method that encourages the inclusion of key terms from the original document in the summary; it attains higher levels of abstraction, with copied passages roughly two times shorter than in prior work, and learns to compress and merge sentences without supervision.
SAFEval: Summarization Asks for Fact-based Evaluation
TLDR
SAFEval substantially improves the correlation with human judgments over four evaluation dimensions (consistency, coherence, fluency, and relevance), as shown in the extensive experiments the authors report.
Rewards with Negative Examples for Reinforced Topic-Focused Abstractive Summarization
We consider the problem of topic-focused abstractive summarization, where the goal is to generate an abstractive summary focused on a particular topic, a phrase of one or multiple words. We…
Towards Human-Free Automatic Quality Evaluation of German Summarization
TLDR
This work demonstrates how to adapt the BLANC metric to a language other than English and shows that BLANC in German is especially good at evaluating informativeness.
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward
TLDR
ASGARD is presented, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD, which proposes the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and the local characteristics of entities, complementing each other.
Question-Based Salient Span Selection for More Controllable Text Summarization
TLDR
A method is presented for incorporating question-answering (QA) signals into a summarization model; it identifies salient noun phrases (NPs) in the input document by automatically generating wh-questions that are answered by the NPs, and automatically determines whether those questions are answered in the gold summaries.
CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization
  • Shuyang Cao, Lu Wang
  • EMNLP 2021
TLDR
It is found that the contrastive learning framework consistently produces more factual summaries than strong comparisons with post error correction, entailment-based reranking, and unlikelihood training, according to QA-based factuality evaluation.
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
TLDR
QAGS (pronounced “kags”), an automatic evaluation protocol designed to identify factual inconsistencies in a generated summary, is proposed and is believed to be a promising tool for automatically generating usable and factually consistent text.

References

Showing 1-10 of 34 references
Guiding Extractive Summarization with Question-Answering Rewards
TLDR
This paper argues that quality summaries should serve as document surrogates to answer important questions; such question-answer pairs can be conveniently obtained from human abstracts, and the proposed model learns to promote summaries that are informative, fluent, and competitive on question answering.
A Deep Reinforced Model for Abstractive Summarization
TLDR
A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, combined with a new training method mixing standard supervised word prediction and reinforcement learning (RL), produces higher-quality summaries.
Question Answering as an Automatic Evaluation Metric for News Article Summarization
TLDR
An end-to-end neural abstractive model is presented that maximizes APES while raising ROUGE scores to competitive results, and the strength of the metric is analyzed by comparing it to known manual evaluation metrics.
Summarization Evaluation in the Absence of Human Model Summaries Using the Compositionality of Word Embeddings
TLDR
The proposed metric is evaluated in replicating the human-assigned scores for summarization systems and summaries on data from the query-focused and update summarization tasks in TAC 2008 and 2009.
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
TLDR
An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state of the art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Better Summarization Evaluation with Word Embeddings for ROUGE
TLDR
This proposal uses word embeddings to overcome ROUGE's disadvantage in evaluating abstractive summaries, or summaries with substantial paraphrasing, by computing the semantic similarity of the words used in summaries instead of relying on exact lexical matches.
Multi-Reward Reinforced Summarization with Saliency and Entailment
TLDR
This work addresses three important aspects of a good summary via a reinforcement learning approach with two novel reward functions, ROUGESal and Entail, on top of a coverage-based baseline, and shows superior performance improvement when these rewards are combined with traditional metric (ROUGE) based rewards.
Actor-Critic based Training Framework for Abstractive Summarization
TLDR
A training framework for neural abstractive summarization based on actor-critic approaches from reinforcement learning that achieves improvements over the state-of-the-art methods.
Ranking Sentences for Extractive Summarization with Reinforcement Learning
TLDR
This paper conceptualizes extractive summarization as a sentence ranking task and proposes a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective.
Improving Abstraction in Text Summarization
TLDR
This work decomposes the decoder into a contextual network that retrieves relevant parts of the source document and a pretrained language model that incorporates prior knowledge about language generation, improving the level of abstraction of the generated summaries.