Corpus ID: 173990766

Efficient Adaptation of Pretrained Transformers for Abstractive Summarization

@article{Hoang2019EfficientAO,
  title={Efficient Adaptation of Pretrained Transformers for Abstractive Summarization},
  author={Andrew Hoang and Antoine Bosselut and Asli Celikyilmaz and Yejin Choi},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.00138}
}
Large-scale learning of transformer language models has yielded improvements on a variety of natural language understanding tasks. […] Finally, we show that these improvements are achieved by producing more focused summaries with less superfluous content, and that performance improvements are more pronounced on more abstractive datasets.
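As a rough illustration of the "more focused, less superfluous" claim above, the following sketch (my own, in Python; not the authors' evaluation code) measures how often a generated summary repeats its own n-grams; a lower repetition rate suggests a more focused summary.

from collections import Counter

def repeated_ngram_rate(text, n=3):
    # Fraction of n-grams in the text that are repeats of an earlier n-gram.
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c - 1 for c in counts.values() if c > 1)
    return repeated / len(ngrams)

print(repeated_ngram_rate("the cat sat on the mat the cat sat on the mat"))  # 0.4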

Citations

Summary Level Training of Sentence Rewriting for Abstractive Summarization
TLDR
A novel training signal is presented that directly maximizes summary-level ROUGE scores through reinforcement learning, and BERT is incorporated into the model to make good use of its natural language understanding ability.
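A minimal sketch of the kind of summary-level reward this TLDR describes, not the cited paper's implementation: score a sampled summary with a simplified unigram ROUGE-1 F1 and use it to scale the sequence log-probability in a REINFORCE-style loss (the model, the sampling routine, and the baseline are assumed to exist elsewhere).

from collections import Counter

def rouge1_f(candidate_tokens, reference_tokens):
    # Simplified unigram-overlap ROUGE-1 F1 (no stemming or stopword handling).
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)

def reinforce_loss(sample_log_prob, sampled_summary, reference_summary, baseline=0.0):
    # REINFORCE: minimizing -(reward - baseline) * log p(sample) pushes the
    # model toward sampled summaries with higher summary-level ROUGE.
    reward = rouge1_f(sampled_summary.split(), reference_summary.split())
    return -(reward - baseline) * sample_log_prob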
Understanding Neural Abstractive Summarization Models via Uncertainty
TLDR
This work analyzes summarization decoders in both black-box and white-box ways by studying the entropy, or uncertainty, of the model's token-level predictions, and shows that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
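A minimal sketch of the token-level uncertainty measure this TLDR describes (assuming PyTorch and a decoder that exposes per-step logits, neither of which is specified in this listing): the entropy of the next-token distribution at each decoding step.

import torch
import torch.nn.functional as F

def token_entropies(logits):
    # logits: tensor of shape (seq_len, vocab_size) from a decoder.
    # Returns a (seq_len,) tensor of per-step entropies in nats.
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)

# Uniform logits give the maximum entropy log(vocab_size) at every step.
print(token_entropies(torch.zeros(3, 50000)))  # ~10.82 nats per step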
Few-Shot Learning for Opinion Summarization
TLDR
This work shows that even a handful of summaries is sufficient to bootstrap generation of the summary text with all expected properties, such as writing style, informativeness, fluency, and sentiment preservation.
Cooperative Generator-Discriminator Networks for Abstractive Summarization with Narrative Flow
TLDR
To promote research toward abstractive summarization with narrative flow, a new dataset is introduced, Scientific Abstract SummarieS (SASS), where the abstracts are used as proxy gold summaries for scientific articles and Co-opNet is proposed, a novel transformer-based framework where the generator works with the discourse discriminator to compose a long-form summary.
Few-Shot Learning for Abstractive Multi-Document Opinion Summarization
TLDR
This work shows that even a handful of summaries is sufficient to bootstrap generation of the summary text with all expected properties, such as writing style, informativeness, fluency, and sentiment preservation.
Extremely Low Resource Text Simplification with Pre-trained Transformer Language Model
TLDR
A simple approach which fine-tunes the pre-trained language model for text simplification with a small parallel corpus and shows that TransformerLM, which is a simple text generation model, substantially outperforms a strong baseline.
MTL-DAS: Automatic Text Summarization for Domain Adaptation
TLDR
MTL-DAS, a unified model for multi-domain adaptive text summarization (Multitask Learning for Domain Adaptation Summarization), is proposed; experiments show that the unified model not only outperforms separately trained models but is also less time-consuming and requires fewer computational resources.
Restructuring Conversations using Discourse Relations for Zero-shot Abstractive Dialogue Summarization.
TLDR
A zero-shot abstractive dialogue summarization method that uses discourse relations to provide structure to conversations and then applies an out-of-the-box document summarization model to create the final summaries; the approach improves the ROUGE score by up to 3 points and performs competitively against other state-of-the-art methods.
Investigation of Pre-Trained Bidirectional Encoder Representations from Transformers Checkpoints for Indonesian Abstractive Text Summarization
TLDR
This study investigated the use of Indonesian BERT for abstractive text summarization on the IndoSum dataset using the BERTSum model, and showed that models with a larger embedding size and a Generative Pre-Training (GPT)-like decoder improved the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score and BERTScore of the results.
Discourse Understanding and Factual Consistency in Abstractive Summarization
TLDR
A general framework for abstractive summarization with factual consistency and distinct modeling of the narrative flow in an output summary is introduced, and empirical results demonstrate that Co-opNet learns to summarize with considerably improved global coherence compared to competitive baselines.

References

Showing 1–10 of 28 references
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
TLDR
This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
TLDR
An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state-of-the-art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Bottom-Up Abstractive Summarization
TLDR
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
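A minimal sketch of the bottom-up idea this TLDR describes, under my own assumptions rather than the cited paper's code: a content selector assigns each source token a keep probability, and copy probabilities are masked to the selected tokens and renormalized.

import torch

def constrain_copy_probs(copy_probs, selector_probs, threshold=0.5):
    # copy_probs, selector_probs: tensors of shape (src_len,) over source tokens.
    # Zero out copy mass on tokens the content selector scores below threshold,
    # then renormalize so the constrained copy distribution still sums to 1.
    mask = (selector_probs >= threshold).float()
    masked = copy_probs * mask
    return masked / masked.sum(dim=-1, keepdim=True).clamp(min=1e-12)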
Neural Summarization by Extracting Sentences and Words
TLDR
This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
TLDR
This article provides a comprehensive literature survey on different seq2seq models for abstractive text summarization from the viewpoint of network structures, training strategies, and summary generation algorithms.
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents, and show that it achieves performance better than or comparable to the state of the art.
A Deep Reinforced Model for Abstractive Summarization
TLDR
A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, combined with a new training method that mixes standard supervised word prediction and reinforcement learning (RL), producing higher-quality summaries.
Abstractive Document Summarization with a Graph-Based Attentional Neural Model
TLDR
A novel graph-based attention mechanism in the sequence-to-sequence framework that addresses the saliency factor of summarization, which has been overlooked by prior works; the resulting model is competitive with state-of-the-art extractive methods.
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
TLDR
This work collects the Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit, and proposes a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store information from different levels of abstraction of the text.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
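A minimal sketch of the "one additional output layer" fine-tuning recipe this TLDR describes, assuming the HuggingFace transformers library (not part of the original paper) and a toy single-example classification step.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # one new linear layer on top of BERT
)

inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
labels = torch.tensor([1])

# One fine-tuning step: all pretrained BERT weights and the new head are updated.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
optimizer.zero_grad()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()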