Get To The Point: Summarization with Pointer-Generator Networks

@article{See2017GetTT,
  title={Get To The Point: Summarization with Pointer-Generator Networks},
  author={A. See and Peter J. Liu and Christopher D. Manning},
  journal={ArXiv},
  year={2017},
  volume={abs/1704.04368}
}
Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). […] First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
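
To make the copy-versus-generate trade-off concrete, here is a minimal sketch (plain NumPy, not the authors' code) of how a pointer-generator network's final output distribution can be formed: a generation probability p_gen blends the decoder's vocabulary distribution with the attention distribution scattered onto the source tokens, so out-of-vocabulary source words can still receive probability mass via copying. The array shapes, extended-vocabulary indexing, and variable names are illustrative assumptions rather than the paper's implementation.

import numpy as np

def pointer_generator_distribution(p_vocab, attention, src_ids, p_gen, extended_vocab_size):
    # p_vocab:   (V,) softmax over the fixed vocabulary from the decoder
    # attention: (T,) attention weights over the T source positions (sums to 1)
    # src_ids:   (T,) ids of the source tokens in the extended vocabulary
    #            (in-article OOV words get temporary ids >= V)
    # p_gen:     scalar in [0, 1], probability of generating rather than copying
    final = np.zeros(extended_vocab_size)
    final[:len(p_vocab)] = p_gen * p_vocab                 # generation component
    np.add.at(final, src_ids, (1.0 - p_gen) * attention)   # copy component (pointing)
    return final

# Toy example: a 5-word vocabulary and a 3-token source containing one OOV word (id 5).
p_vocab = np.array([0.1, 0.4, 0.2, 0.2, 0.1])
attention = np.array([0.7, 0.2, 0.1])
src_ids = np.array([1, 5, 3])  # the second source token is out-of-vocabulary
dist = pointer_generator_distribution(p_vocab, attention, src_ids, p_gen=0.6, extended_vocab_size=6)
print(dist, dist.sum())        # sums to 1; the OOV word gets mass only through copying

Because the copy component is added onto the same distribution, a word that appears both in the vocabulary and in the source simply accumulates probability from both paths.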


Blending approaches to Abstractive Summarization

This paper finds that adding intra-attention over past decoder states to the pointer-generator architecture has the potential to improve performance, and shows the benefits of incorporating the prediction probabilities from the content selector directly into the copy-attention mechanism to improve the bottom-up approach.

Summarization with Highway Condition Radom Pointer-Generator Network

A Highway Condition Radom Pointer-Generator Network (HCRPGN) is proposed, which introduces a CRF layer to solve the duplication problem and uses highway recurrent cells to optimize the neuron structure and prevent model degradation.

Reinforced Generative Adversarial Network for Abstractive Text Summarization

A new architecture that combines reinforcement learning and generative adversarial networks to enhance the sequence-to-sequence attention model is proposed, using a hybrid pointer-generator network that copies words directly from the source text, contributing to accurate reproduction of information without sacrificing the generator's ability to produce novel words.

Pointer-Generator Abstractive Text Summarization Model with Part of Speech Features

  • Shuxia Ren, Zheming Zhang
  • Computer Science
    2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS)
  • 2019
A pointer-generator text summarization model with part-of-speech features that improves the quality of generated abstracts by combining a convolutional neural network (CNN) with a bi-directional LSTM, and uses the pointer-generator network to control whether words are generated or copied, addressing the out-of-vocabulary (OOV) problem.

Reading More Efficiently: Multi-sentence Summarization with a Dual Attention and Copy-Generator Network

A novel model with dual attention that considers both sentence- and word-level information and then generates a multi-sentence summary word by word is proposed to solve the out-of-vocabulary (OOV) problem.

Neural Abstractive Summarization on the Gigaword Dataset

A modified bottom-up abstractive summarization pipeline that is inspired by style transfer in computer vision is developed and a model with hierarchical attention is trained in order to model the source documents at both the word and sentence level.

Planning with Entity Chains for Abstractive Summarization

This work proposes to use entity chains (i.e., chains of entities mentioned in the summary) to better plan and ground the generation of abstractive summaries in neural summarization.

Pointer over Attention: An Improved Bangla Text Summarization Approach Using Hybrid Pointer Generator Network

This work augments the attention-based sequence-to-sequence model with a hybrid pointer-generator network that can generate out-of-vocabulary words and enhance accuracy in reproducing authentic details, together with a coverage mechanism that discourages repetition.

Deep Architectures for Abstractive Text Summarization in Multiple Languages

A novel method for working with agglutinative languages is presented: a preprocessing technique applied to the dataset that increases the relevancy of the vocabulary, which improves the effectiveness of text summarization without modifying the models.

More Abstractive Summarization with Pointer-Generator Networks

A new model is formulated that produces comparable performance in terms of ROUGE scores and is able to produce significantly more novel n-grams (at least 30% increase) than the baseline model.
...

References

Showing 1-10 of 33 references

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.

SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents

We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to the state of the art.

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks

A conditional recurrent neural network (RNN) that generates a summary of an input sentence, significantly outperforming the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.

Incorporating Copying Mechanism in Sequence-to-Sequence Learning

This paper incorporates copying into neural network-based Seq2Seq learning and proposes a new model called CopyNet with encoder-decoder structure which can nicely integrate the regular way of word generation in the decoder with the new copying mechanism which can choose sub-sequences in the input sequence and put them at proper places in the output sequence.

A Neural Attention Model for Abstractive Sentence Summarization

This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
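
The reversal trick is simple to illustrate: each source sequence is reversed before it is fed to the encoder, while the target sequence is left unchanged, so the beginning of the source ends up adjacent to the beginning of the target. A minimal sketch (token list and names are illustrative):

def reverse_source(src_tokens):
    # Reverse the source sentence only; the target is untouched.
    # This shortens the distance between early source words and
    # the early target words they most influence.
    return list(reversed(src_tokens))

print(reverse_source(["the", "cat", "sat", "on", "the", "mat"]))
# ['mat', 'the', 'on', 'sat', 'cat', 'the']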

Sequence Level Training with Recurrent Neural Networks

This work proposes a novel sequence level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE, and outperforms several strong baselines for greedy generation.

Pointing the Unknown Words

A novel way to deal with rare and unseen words in attention-based neural network models is proposed, which uses two softmax layers in order to predict the next word in conditional language models.

Modeling Coverage for Neural Machine Translation

This paper proposes coverage-based NMT, which maintains a coverage vector to keep track of the attention history and improves both translation quality and alignment quality over standard attention-based NMT.
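
As a rough sketch of the mechanism described here, the coverage vector at decoding step t accumulates the attention paid to each source position i over all previous steps, and the next attention computation is conditioned on it; in the summarization setting of the main paper, a coverage loss additionally penalizes re-attending to already-covered positions. The notation below is a simplified rendering, not the cited paper's exact formulation:

c_i^t = \sum_{t'=0}^{t-1} a_i^{t'}, \qquad \mathrm{covloss}_t = \sum_i \min\!\left(a_i^t, c_i^t\right)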

Temporal Attention Model for Neural Machine Translation

This work proposes a novel mechanism to address some of these limitations and improve the NMT attention that memorizes the alignments temporally and modulates the attention with the accumulated temporal memory, as the decoder generates the candidate translation.