Using Pre-Trained Transformer for Better Lay Summarization

@inproceedings{Kim2020UsingPT,
  title={Using Pre-Trained Transformer for Better Lay Summarization},
  author={Seungwon Kim},
  booktitle={SDP},
  year={2020}
}
  • Seungwon Kim
  • Published in SDP, 1 November 2020
  • Computer Science
In this paper, we tackle the lay summarization task, which aims to automatically produce lay summaries of scientific papers, as part of the first CL-LaySumm 2020 shared task at the SDP workshop at EMNLP 2020. We present our approach of using Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS; Zhang et al., 2019b) to produce the lay summary and of combining it with an extractive summarization model based on Bidirectional Encoder Representations from Transformers (BERT; Devlin et al… 
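
To make the abstractive side of this pipeline concrete, here is a minimal sketch of generating a summary with a pretrained PEGASUS checkpoint through the Hugging Face transformers library. The checkpoint name, truncation length, and beam-search settings are illustrative assumptions, not the configuration from the paper, which fine-tunes PEGASUS on the LaySumm data and combines its output with a BERT-based extractive model.

```python
# Minimal sketch: abstractive summarization with a pretrained PEGASUS checkpoint.
# The checkpoint and decoding settings are illustrative stand-ins.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "google/pegasus-arxiv"  # assumed stand-in for a LaySumm fine-tuned model
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

paper_text = "Full text or selected sections of a scientific paper..."
inputs = tokenizer(paper_text, truncation=True, max_length=1024, return_tensors="pt")

summary_ids = model.generate(**inputs, num_beams=4, max_length=256, early_stopping=True)
lay_summary = tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
print(lay_summary)
```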

Citations

Overview and Insights from the Shared Tasks at Scholarly Document Processing 2020: CL-SciSumm, LaySumm and LongSumm
TLDR
The quality and quantity of the submissions show that there is ample interest in scholarly document summarization, and that the state of the art in this domain sits midway between an impossible task and a fully solved one.

References

SHOWING 1-10 OF 38 REFERENCES
Text Summarization with Pretrained Encoders
TLDR
This paper introduces a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences and proposes a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two.
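As a rough illustration of the fine-tuning schedule described here, the sketch below builds two Adam optimizers with different learning rates for a pretrained encoder and a randomly initialized decoder. The parameter-name prefixes and learning rates are assumptions for a generic PyTorch encoder-decoder model, not the values reported in the paper.

```python
import torch

def build_optimizers(model):
    """Separate Adam optimizers for a pretrained encoder and a fresh decoder.
    The 'encoder.'/'decoder.' prefixes and learning rates are illustrative."""
    enc_params = [p for n, p in model.named_parameters() if n.startswith("encoder.")]
    dec_params = [p for n, p in model.named_parameters() if n.startswith("decoder.")]
    opt_enc = torch.optim.Adam(enc_params, lr=2e-5)  # small lr: encoder is already pretrained
    opt_dec = torch.optim.Adam(dec_params, lr=1e-3)  # larger lr: decoder is trained from scratch
    return opt_enc, opt_dec
```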
Get To The Point: Summarization with Pointer-Generator Networks
TLDR
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
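A compact sketch of the copy mechanism this entry describes: the final word distribution mixes a vocabulary distribution with the attention distribution over source tokens, weighted by a generation probability. Tensor names and shapes are illustrative.

```python
import torch

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids):
    """p_gen: (batch, 1) generation probability; vocab_dist: (batch, vocab_size);
    attn_dist: (batch, src_len) attention weights; src_ids: (batch, src_len)
    vocabulary ids of the source tokens. Illustrative pointer-generator mix."""
    gen_part = p_gen * vocab_dist
    copy_part = torch.zeros_like(vocab_dist).scatter_add(1, src_ids, (1.0 - p_gen) * attn_dist)
    return gen_part + copy_part  # P(w) = p_gen * P_vocab(w) + (1 - p_gen) * attention mass on w
```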
A Deep Reinforced Model for Abstractive Summarization
TLDR
A neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL) that produces higher quality summaries.
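The mixed training objective mentioned here can be sketched as a weighted sum of the usual maximum-likelihood loss and a self-critical policy-gradient term; the weighting factor below is a placeholder, not the value used in the paper.

```python
def mixed_objective(ml_loss, sampled_log_prob, sampled_reward, baseline_reward, gamma=0.99):
    """Blend supervised word prediction with a self-critical RL term.
    Rewards would typically come from a metric such as ROUGE; gamma is illustrative."""
    rl_loss = -(sampled_reward - baseline_reward) * sampled_log_prob
    return gamma * rl_loss + (1.0 - gamma) * ml_loss
```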
Neural Summarization by Extracting Sentences and Words
TLDR
This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to the state of the art.
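To illustrate the sentence-level extraction that this and the preceding entry describe, here is a minimal PyTorch sketch that runs an RNN over sentence embeddings and outputs a keep probability per sentence. The dimensions and class name are assumptions, not either paper's exact architecture.

```python
import torch
import torch.nn as nn

class SentenceExtractor(nn.Module):
    """Minimal extractive scorer: a sentence-level GRU followed by a sigmoid
    that produces, for each sentence, the probability of keeping it in the summary."""
    def __init__(self, sent_dim=256, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(sent_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, sent_embeds):            # (batch, num_sents, sent_dim)
        states, _ = self.rnn(sent_embeds)      # (batch, num_sents, 2 * hidden)
        return torch.sigmoid(self.score(states)).squeeze(-1)  # keep probabilities
```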
Pretraining-Based Natural Language Generation for Text Summarization
TLDR
A novel pretraining-based encoder-decoder framework, which can generate the output sequence based on the input sequence in a two-stage manner, which achieves new state-of-the-art on both CNN/Daily Mail and New York Times datasets.
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
TLDR
This work proposes the first model for abstractive summarization of single, longer-form documents (e.g., research papers), consisting of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary.
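The discourse-aware attention described here can be sketched as hierarchical attention: the decoder attends over sections, attends over words within each section, and weights the word attention by its section's weight. The shapes and scoring function below are simplified assumptions, not the paper's exact formulation.

```python
import torch

def discourse_aware_attention(dec_state, word_states, section_states):
    """dec_state: (batch, dim); word_states: (batch, sections, words, dim);
    section_states: (batch, sections, dim). Illustrative hierarchical attention."""
    sec_attn = torch.softmax(torch.einsum("bd,bsd->bs", dec_state, section_states), dim=-1)
    word_attn = torch.softmax(torch.einsum("bd,bswd->bsw", dec_state, word_states), dim=-1)
    combined = sec_attn.unsqueeze(-1) * word_attn        # word weights scaled by their section
    context = torch.einsum("bsw,bswd->bd", combined, word_states)
    return context
```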
Headline Generation: Learning from Decomposable Document Titles
TLDR
A novel method for generating titles for unstructured text documents is proposed and the results of a randomized double-blind trial in which subjects were unaware of which titles were human or machine-generated are presented.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
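As a small illustration of the "one additional output layer" point, the sketch below wraps a pretrained BERT encoder with a single linear classification head via the Hugging Face transformers library; the checkpoint name and number of labels are illustrative.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
classifier = torch.nn.Linear(bert.config.hidden_size, 2)  # the single added output layer

inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
cls_state = bert(**inputs).last_hidden_state[:, 0]  # [CLS] token representation
logits = classifier(cls_state)                      # encoder and head are fine-tuned jointly
```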