Enhanced Seq2Seq Autoencoder via Contrastive Learning for Abstractive Text Summarization

Authors: Chujie Zheng, Kunpeng Zhang, Harry J. Wang, Ling Fan, Zhe Wang
Venue: 2021 IEEE International Conference on Big Data (Big Data)
In this paper, we present a denoising sequence-to-sequence (seq2seq) autoencoder via contrastive learning for abstractive text summarization. Our model adopts a standard Transformer-based architecture with a multi-layer bi-directional encoder and an auto-regressive decoder. To enhance its denoising ability, we incorporate self-supervised contrastive learning along with various sentence-level document augmentations. These two components, seq2seq autoencoder and contrastive learning, are jointly…
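
The idea of pairing sentence-level document augmentation with a contrastive objective can be illustrated with a minimal sketch. The abstract does not specify which augmentations or which loss the authors use, so everything below is an assumption made for illustration: the function names (`split_sentences`, `augment`, `info_nce`) are hypothetical, the augmentation is a simple drop-and-shuffle, and the loss is a generic InfoNCE formulation over cosine similarities. In a real model the vectors would come from the encoder; here they are small hand-written lists.

```python
import math
import random


def split_sentences(document):
    """Naive sentence splitter (assumption: sentences are separated by '.')."""
    return [s.strip() for s in document.split(".") if s.strip()]


def augment(sentences, rng, drop_prob=0.2):
    """Build a noisy view of a document: randomly drop sentences, then shuffle."""
    kept = [s for s in sentences if rng.random() > drop_prob]
    if not kept:  # always keep at least one sentence
        kept = [rng.choice(sentences)]
    rng.shuffle(kept)
    return kept


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: negative log-softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum - logits[0]


# Toy usage: two augmented views of the same document would form a positive
# pair, while views of other documents would serve as negatives.
rng = random.Random(0)
doc = "The model reads the document. It encodes each token. The decoder writes a summary."
sentences = split_sentences(doc)
view = augment(sentences, rng)

loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The loss is small when the anchor is closer to its positive than to the negatives and large otherwise, which is the property a contrastive term would contribute when trained jointly with the reconstruction objective.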


Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.
Text Summarization with Pretrained Encoders
This paper introduces a novel document-level encoder based on BERT that can express the semantics of a document and obtain representations for its sentences. It also proposes a new fine-tuning schedule that adopts different optimizers for the encoder and the decoder to alleviate the mismatch between the two.
Selective Encoding for Abstractive Sentence Summarization
The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
This work presents a conditional recurrent neural network (RNN) that generates a summary of an input sentence. It significantly outperforms the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.
Bottom-Up Abstractive Summarization
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
Abstractive Document Summarization with a Graph-Based Attentional Neural Model
This work proposes a novel graph-based attention mechanism in the sequence-to-sequence framework to address the saliency factor of summarization, which has been overlooked by prior work. The resulting model is competitive with state-of-the-art extractive methods.
NCLS: Neural Cross-Lingual Summarization
This work presents an end-to-end CLS framework, which it refers to as Neural Cross-Lingual Summarization (NCLS), and proposes to further improve NCLS by incorporating two related tasks, monolingual summarization and machine translation, into the training process of CLS under multi-task learning.
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to…
A Neural Attention Model for Abstractive Sentence Summarization
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
Self-Attention Guided Copy Mechanism for Abstractive Summarization
A Transformer-based model is proposed to enhance the copy mechanism by identifying the importance of each source word based on the degree centrality with a directed graph built by the self-attention layer in the Transformer.