A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents

@inproceedings{Cohan2018ADA,
  title={A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents},
  author={Arman Cohan and Franck Dernoncourt and Doo Soon Kim and Trung Bui and Seokhwan Kim and W. Chang and Nazli Goharian},
  booktitle={NAACL},
  year={2018}
}
Neural abstractive summarization models have led to promising results in summarizing relatively short documents. [...] Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models.
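The key method lends itself to a compact sketch: attention over discourse sections rescales word-level attention before normalization, so the decoder focuses on words in the sections it currently cares about. The NumPy sketch below is a minimal, illustrative rendering of that weighting scheme under assumed names and a plain dot-product score; it is not the paper's exact parameterization, which uses a learned attention function.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def discourse_aware_attention(dec_state, section_vecs, word_vecs, section_of_word):
    """Word-level attention rescaled by attention over discourse sections.

    dec_state       -- (d,)   current decoder state
    section_vecs    -- (S, d) one encoder vector per discourse section
    word_vecs       -- (W, d) encoder vectors for all source words
    section_of_word -- (W,)   section index of each word
    """
    beta = softmax(section_vecs @ dec_state)          # section distribution
    logits = word_vecs @ dec_state
    # Shift by the max for stability; the shift cancels after normalization.
    scores = np.exp(logits - logits.max()) * beta[section_of_word]
    alpha = scores / scores.sum()                     # word distribution
    return alpha @ word_vecs                          # context vector

# Toy usage: 3 sections of 4 words each, hidden size 8.
rng = np.random.default_rng(0)
ctx = discourse_aware_attention(
    rng.normal(size=8), rng.normal(size=(3, 8)),
    rng.normal(size=(12, 8)), np.repeat(np.arange(3), 4))
```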
Extractive Summarization of Long Documents by Combining Global and Local Context
A novel neural single-document extractive summarization model for long documents that incorporates both the global context of the whole document and the local context within the current topic, outperforming previous extractive and abstractive models.
Improved Document Modelling with a Neural Discourse Parser
This paper proposes to use neural discourse representations obtained from a rhetorical structure theory (RST) parser to enhance document representations; the discourse representations are generated for discourse spans known as elementary discourse units (EDUs).
Predicting Discourse Trees from Transformer-based Neural Summarizers
Experiments across models and datasets reveal that the summarizer learns both dependency- and constituency-style discourse information, which is typically encoded in a single head and covers long- and short-distance discourse dependencies.
Extreme Summarization with Topic-Aware Convolutional Neural Networks
We introduce extreme summarization, a new single-document summarization task which aims at creating a short, one-sentence news summary answering the question “What is the article about?”.
An Editorial Network for Enhanced Document Summarization
We suggest a new idea of an Editorial Network, a mixed extractive-abstractive summarization approach, which is applied as a post-processing step over a given sequence of extracted sentences.
Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization
This paper employs attention mechanisms to model interactions between different granularities of semantic representations, which helps capture multi-granularity key information and improves the performance of both abstractive and extractive summarization.
Discourse-Aware Hierarchical Attention Network for Extractive Single-Document Summarization
This work proposes a discourse-aware neural extractive summarizer which explicitly takes into account the discourse dependency tree structure of the source document, achieving competitive or better performance against state-of-the-art models in terms of ROUGE scores on the DailyMail dataset.
Inducing Document Structure for Aspect-based Summarization
It is shown that the learnt document structure can be leveraged to produce both abstractive and extractive aspect-based summaries, and that structure is particularly advantageous for summarizing long documents.
The Effect of Pretraining on Extractive Summarization for Scientific Documents
This work derives significant performance improvements using an intermediate pretraining step that leverages existing summarization datasets, and reports state-of-the-art results on a recently released scientific summarization dataset, SciTLDR.
Globalizing BERT-based Transformer Architectures for Long Document Summarization
This work introduces a novel hierarchical propagation layer that spreads information between multiple transformer windows, adopting a hierarchical approach in which the input is divided into multiple blocks that are independently processed by scaled dot-product attention and combined between successive layers.

References

Showing 10 of 32 references.
A Neural Attention Model for Abstractive Sentence Summarization
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents, and show that it achieves performance better than or comparable to the state of the art.
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.
Generating Wikipedia by Summarizing Long Sequences
It is shown that generating English Wikipedia articles can be approached as multi-document summarization of source documents, and a neural abstractive model is introduced which can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles.
A Deep Reinforced Model for Abstractive Summarization
A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, and a new training method that combines standard supervised word prediction with reinforcement learning (RL), producing higher-quality summaries.
Cascaded Attention based Unsupervised Information Distillation for Compressive Summarization
A cascaded attention-based unsupervised model that estimates salience information from text for compressive multi-document summarization, achieving better results than state-of-the-art methods.
Coarse-to-Fine Attention Models for Document Summarization
A novel coarse-to-fine attention model that hierarchically reads a document, using coarse attention to select top-level chunks of text and fine attention to read the words of the chosen chunks, achieving the desired behavior of sparsely attending to subsets of the document during generation.
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
A conditional recurrent neural network (RNN) that generates a summary of an input sentence and significantly outperforms the recently proposed state-of-the-art method on the Gigaword corpus, while performing competitively on the DUC-2004 shared task.
Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion
This paper details the design of a generic extractive summarization system which ranked first out of 22 systems in terms of overall mean Pyramid score, and third out of 35 systems in the human evaluation of summary responsiveness to the topic.
Challenges in Data-to-Document Generation
A new, large-scale corpus of data records paired with descriptive documents is introduced, a series of extractive evaluation methods for analyzing performance are proposed, and baseline results are obtained using current neural generation methods.