On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

@inproceedings{Pilault2020OnEA,
  title={On Extractive and Abstractive Neural Document Summarization with Transformer Language Models},
  author={Jonathan Pilault and Raymond Li and Sandeep Subramanian and Christopher Joseph Pal},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2020},
  pages={9308--9319}
}
  • Jonathan Pilault, Raymond Li, Sandeep Subramanian, Christopher Joseph Pal
  • Published 2020
  • Computer Science
  • Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. [...] We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared…
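The pipeline described in the abstract (an extractive pass whose output conditions a transformer language model, which then writes the abstractive summary) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: a TF-IDF similarity ranker stands in for their trained sentence extractor, a stock GPT-2 model loaded through the Hugging Face transformers library stands in for their long-document transformer language model, and the helper names (extract_salient_sentences, abstractive_summary) are introduced here purely for illustration.

# Sketch of the extract-then-abstract data flow from the abstract above.
# Assumptions: TF-IDF ranking replaces the paper's trained extractor, and an
# off-the-shelf GPT-2 replaces the paper's trained transformer language model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def extract_salient_sentences(sentences, query, top_k=5):
    """Rank document sentences by TF-IDF cosine similarity to a query
    (e.g. the paper's introduction) and keep the top_k, in document order."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([query] + sentences)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]

def abstractive_summary(extracted_sentences, max_new_tokens=150):
    """Condition a pretrained transformer LM on the extracted sentences and
    generate the abstractive summary as a continuation of that context."""
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    prompt = " ".join(extracted_sentences) + "\nTL;DR:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=800)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

In the paper's full system both the extractive and abstractive components are trained models; the sketch only mirrors the data flow of first extracting relevant sentences and then conditioning the language model on them before generation.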
Citations

Scientific Document Summarization for LaySumm ’20 and LongSumm ’20
Automatic text summarization has been widely studied as an important task in natural language processing. Traditionally, various feature engineering and machine learning based systems have been…
Globalizing BERT-based Transformer Architectures for Long Document Summarization
TLDR: This work introduces a novel hierarchical propagation layer that spreads information between multiple transformer windows, adopting a hierarchical approach in which the input is divided into multiple blocks that are independently processed by scaled dot-product attention and combined between successive layers.
Summaformers @ LaySumm 20, LongSumm 20
TLDR: This paper distinguishes between two types of summaries: a very short summary that captures the essence of the research paper in layman's terms, and a much longer, detailed summary aimed at providing specific insights into the various ideas touched upon in the paper.
Neural Abstractive Unsupervised Summarization of Online News Discussions
TLDR: A novel method that generates abstractive summaries of online news discussions, capturing the most relevant aspects of the news item that users comment on and incorporating the social context as a source of information for summarizing texts in online social networks.
Long Document Summarization in a Low Resource Setting using Pretrained Language Models
TLDR: A novel algorithm based on GPT-2 (Radford et al., 2019) language-model perplexity scores that operates in a low-resource regime, summarizing long legal briefs with an average source document length of 4268 words and only 120 available (document, summary) pairs.
Combination of abstractive and extractive approaches for summarization of long scientific texts
TLDR: A method to generate summaries of long scientific documents that combines the advantages of extractive and abstractive approaches, using pre-trained transformer-based language models for both the extractor and the abstractor.
EASE: Extractive-Abstractive Summarization with Explanations
TLDR: An explainable summarization system based on the Information Bottleneck principle that is jointly trained for extraction and abstraction in an end-to-end fashion; explanations from this framework are more relevant than simple baselines without substantially sacrificing the quality of the generated summary.
Long-Span Summarization via Local Attention and Content Selection
TLDR
This work exploits large pre-trained transformer-based models and address long-span dependencies in abstractive summarization using two methods: local self-attention; and explicit content selection, which can achieve comparable or better results than existing approaches. Expand
Long-Span Dependencies in Transformer-based Summarization Systems
TLDR: This work exploits large pre-trained transformer-based models and addresses long-span dependencies in abstractive summarization via local self-attention and explicit content selection, achieving results comparable to or better than existing approaches.

References

Showing 1–10 of 52 references
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
TLDR: An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary, achieving a new state of the art on all metrics on the CNN/Daily Mail dataset as well as significantly higher abstractiveness scores.
Controlling Decoding for More Abstractive Summaries with Copy-Based Networks
TLDR: This paper proposes a simple baseline method that allows the amount of copying to be controlled without retraining, providing a strong baseline for abstractive systems that aim to obtain high ROUGE scores while minimizing overlap with the source article.
Jointly Extracting and Compressing Documents with Summary State Representations
TLDR: A new neural model for text summarization that first extracts sentences from a document and then compresses them, improving over current extractive and abstractive methods.
Bottom-Up Abstractive Summarization
TLDR: This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, showing that this approach improves the ability to compress text while still generating fluent summaries.
BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization
TLDR: A novel dataset, BIGPATENT, consisting of 1.3 million records of U.S. patent documents with human-written abstractive summaries, whose key properties are that (i) summaries contain a richer discourse structure with more recurring entities, (ii) salient content is evenly distributed in the input, and (iii) fewer and shorter extractive fragments are present in the summaries.
Classify or Select: Neural Architectures for Extractive Document Summarization
TLDR: Two novel and contrasting recurrent neural network (RNN) based architectures for extractive summarization of documents are presented; the models under both architectures jointly capture the notions of salience and redundancy of sentences.
Neural Summarization by Extracting Sentences and Words
TLDR: A general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor, allowing for different classes of summarization models that can extract sentences or words.
A Deep Reinforced Model for Abstractive Summarization
TLDR: A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, together with a new training method that combines standard supervised word prediction and reinforcement learning (RL) to produce higher-quality summaries.
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
TLDR: By training the model end-to-end with the inconsistency loss and the original losses of the extractive and abstractive models, it achieves state-of-the-art ROUGE scores while being the most informative and readable summarization system on the CNN/Daily Mail dataset in a solid human evaluation.
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
TLDR: A conditional recurrent neural network (RNN) that generates a summary of an input sentence, significantly outperforming the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.