BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle

@inproceedings{West2019BottleSumUA,
  title={BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle},
  author={Peter West and Ari Holtzman and Jan Buys and Yejin Choi},
  booktitle={EMNLP/IJCNLP},
  year={2019}
}
The principle of the Information Bottleneck (Tishby et al., 1999) is to produce a summary of information X optimized to predict some other relevant information Y. In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the Information Bottleneck principle to a conditional language modelling objective: given a sentence, our approach seeks a compressed sentence that can best predict the next sentence. [...] Our iterative algorithm under the Information Bottleneck [...]
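In Information Bottleneck terms, the summary S of a source sentence X is chosen to minimize I(S; X) − β·I(S; Y): compress X as much as possible while preserving whatever information is relevant to the next sentence Y. The sketch below is a minimal illustration of the conditional language-modelling reading of that trade-off, not the authors' released code: candidate summaries are produced by deleting words from the source sentence, and each candidate is scored by how well a pretrained language model predicts the next sentence from it, with shorter candidates preferred. The greedy one-word-per-step deletion schedule, the fixed target length, and the use of GPT-2 via the HuggingFace transformers library are assumptions made for illustration rather than the paper's exact procedure.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def next_sentence_log_prob(candidate: str, next_sentence: str) -> float:
    # Relevance term: sum of log p(next_sentence tokens | candidate) under the LM.
    ctx_ids = tokenizer.encode(candidate)
    cont_ids = tokenizer.encode(" " + next_sentence)
    ids = torch.tensor([ctx_ids + cont_ids])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits, dim=-1)
    # The token at position i is predicted by the logits at position i - 1.
    return sum(log_probs[0, i - 1, ids[0, i]].item()
               for i in range(len(ctx_ids), ids.shape[1]))

def bottleneck_style_summary(sentence: str, next_sentence: str, target_len: int = 8) -> str:
    # Compression term: keep shrinking the candidate; at each step drop the single
    # word whose removal hurts the relevance term the least.
    words = sentence.split()
    while len(words) > target_len:
        scored = [(next_sentence_log_prob(" ".join(words[:i] + words[i + 1:]), next_sentence),
                   words[:i] + words[i + 1:]) for i in range(len(words))]
        words = max(scored, key=lambda t: t[0])[1]
    return " ".join(words)

In the full paper this extractive search (BottleSum^Ex) also supplies training data for a self-supervised abstractive variant (BottleSum^Self); the sketch above covers only the extractive side.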
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction
TLDR
This work proposes a new state of the art for unsupervised sentence summarization according to ROUGE scores, and demonstrates that the commonly reported ROUGE F1 metric is sensitive to summary length.
RepSum: Unsupervised Dialogue Summarization based on Replacement Strategy
In the field of dialogue summarization, due to the lack of training data, it is often difficult for supervised summary generation methods to learn vital information from dialogue context. [...]
Unsupervised Opinion Summarization as Copycat-Review Generation
TLDR
A generative model for a review collection is defined which capitalizes on the intuition that, when generating a new review given a set of other reviews of a product, one should be able to control the "amount of novelty" going into the new review or, equivalently, vary the extent to which it deviates from the input.
The Summary Loop: Learning to Write Abstractive Summaries Without Examples
TLDR
This work introduces a novel method that encourages the inclusion of key terms from the original document in the summary; it attains higher levels of abstraction, with copied passages roughly two times shorter than in prior work, and learns to compress and merge sentences without supervision.
Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes
TLDR
Experimental results show that SuTaT is superior for unsupervised dialogue summarization in both automatic and human evaluations, and is capable of dialogue classification and single-turn conversation generation.
Unsupervised Multi-Document Opinion Summarization as Copycat-Review Generation
TLDR
A hierarchical variational autoencoder model is defined that, when generating a new review given a set of other reviews of the product, can control the "amount of novelty" going into the new review or, equivalently, vary the degree of deviation from the input reviews.
Improving Unsupervised Extractive Summarization with Facet-Aware Modeling
  • Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
  • Findings 2021
TLDR
Experimental results show that the novel facet-aware centrality-based ranking model consistently outperforms strong baselines, especially in long- and multi-document scenarios, and even performs comparably to some supervised models.
Deep Differential Amplifier for Extractive Summarization
For sentence-level extractive summarization, there is a disproportionate ratio of selected to unselected sentences, which leads to flattening of the summary features when optimizing the classification objective. [...]
EASE: Extractive-Abstractive Summarization with Explanations
TLDR
This work presents an explainable summarization system based on the Information Bottleneck principle that is jointly trained for extraction and abstraction in an end-to-end fashion, and shows that explanations from this framework are more relevant than simple baselines, without substantially sacrificing the quality of the generated summary.
Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization
TLDR
This work proposes that the lead bias can be leveraged in a simple and effective way to pretrain abstractive news summarization models on a large-scale unlabeled corpus: predicting the leading sentences from the rest of an article.
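As a rough illustration of the lead-bias objective summarized above, the snippet below builds self-supervised (source, target) pairs from unlabeled articles: the leading sentences become the target summary and the remainder of the article becomes the input. The regex-based sentence splitter, the choice of three lead sentences, and the minimum-length filter are assumptions for illustration, not the paper's preprocessing.

import re

def lead_bias_pairs(articles, num_lead_sentences=3, min_body_sentences=5):
    # Build (input, target) pairs: the model is pretrained to generate the
    # leading sentences of an article from the rest of its body.
    pairs = []
    for article in articles:
        # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
        sentences = re.split(r"(?<=[.!?])\s+", article.strip())
        if len(sentences) < num_lead_sentences + min_body_sentences:
            continue  # skip articles too short to yield a useful pair
        target = " ".join(sentences[:num_lead_sentences])
        source = " ".join(sentences[num_lead_sentences:])
        pairs.append((source, target))
    return pairs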

References

SHOWING 1-10 OF 26 REFERENCES
Neural Summarization by Extracting Sentences and Words
TLDR
This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.
Simple Unsupervised Summarization by Contextual Matching
TLDR
An unsupervised method for sentence summarization using only language modeling, which combines two language models, one generic (i.e., pretrained) and one specific to the target domain, via a product-of-experts criterion.
Unsupervised Sentence Compression using Denoising Auto-Encoders
TLDR
Although the models underperform supervised models on ROUGE scores, they are competitive with a supervised baseline in human evaluations of grammatical correctness and retention of meaning.
A Neural Attention Model for Abstractive Sentence Summarization
TLDR
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
Language Models are Unsupervised Multitask Learners
TLDR
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
TLDR
This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
TLDR
This work formulates a variational auto-encoder for inference in a deep generative model of text in which the latent representation of a document is itself drawn from a discrete language-model distribution, and shows that generative formulations of both abstractive and extractive compression yield state-of-the-art results when trained on a large amount of supervised data.
Get To The Point: Summarization with Pointer-Generator Networks
TLDR
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
Deep Recurrent Generative Decoder for Abstractive Text Summarization
TLDR
A new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGN) achieves improvements over state-of-the-art methods.
Headline Generation Based on Statistical Translation
TLDR
This paper presents results on experiments using this approach, in which statistical models of the term selection and term ordering are jointly applied to produce summaries in a style learned from a training corpus.