Long Document Summarization in a Low Resource Setting using Pretrained Language Models

Ahsaas Bajaj, Pavitra Dangati, Kalpesh Krishna, Pradhiksha Ashok Kumar, Rheeya Uppaal, Brad Windsor, Eliot Brenner, Dominic Dotterrer, Rajarshi Das, Andrew McCallum

Abstractive summarization is the task of compressing a long document into a coherent short document while retaining salient information. Modern abstractive summarization methods are based on deep neural networks which often require large training datasets. Since collecting summarization datasets is an expensive and time-consuming task, practical industrial settings are usually low-resource. In this paper, we study a challenging low-resource setting of summarizing long legal briefs with an… 

Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization

This work introduces PureText, a simple yet effective pre-processing layer that removes low-quality sentences from articles, improving the performance of existing summarization models, especially on long text articles.

Semantic Self-Segmentation for Abstractive Summarization of Long Documents in Low-Resource Regimes

Experimental outcomes show the Se3 approach significantly improves the performance of abstractive summarization transformers, even with just a dozen labeled examples, achieving new state-of-the-art results on two legal datasets of different domains and contents.

DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

DYLE jointly trains an extractor and a generator and treats the extracted text snippets as the latent variable, allowing dynamic snippet-level attention weights during decoding, and shows that the proposed dynamic weights provide interpretability of the generation process.

Joint abstractive and extractive method for long financial document summarization

This work proposes an end-to-end financial narrative summarization system that first selects salient sentences from the document and then paraphrases the extracted sentences to generate a concise overall summary that maximises the ROUGE metric against the gold standard summary.

Exploring Neural Models for Query-Focused Summarization

This work conducts a systematic exploration of neural approaches to query-focused summarization (QFS), considering two general classes of methods: two-stage extractive-abstractive solutions and end-to-end models. It investigates existing models and explores strategies for transfer learning.

HLDC: Hindi Legal Documents Corpus

This work introduces the Hindi Legal Documents Corpus (HLDC), a corpus of more than 900K legal documents in Hindi, together with the task of bail prediction as a use case for the corpus, and proposes a Multi-Task Learning (MTL) based model.

ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining

This work introduces a simple technique to capture the argumentative structure of legal documents by integrating argument role labeling into the summarization process and shows that this proposed approach improves performance over strong baselines.

Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization

This paper introduces a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness.

HULAT-UC3M at SimpleText@CLEF-2022: Scientific text simplification using BART

This paper describes the system developed by the HULAT-UC3M group from Universidad Carlos III de Madrid to solve Task 3 of SimpleText@CLEF-2022 on scientific text simplification. We present an…



On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

A simple extractive step is performed before generating a summary; its output is then used to condition the transformer language model on relevant information before the model is tasked with generating a summary.

Abstract Text Summarization: A Low Resource Challenge

This work builds an abstractive text summarizer for German-language text using the state-of-the-art “Transformer” model and proposes an iterative data augmentation approach which uses synthetic data along with the real summarization data for the German language.

Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state-of-the-art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.

Bottom-Up Abstractive Summarization

This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.

Neural Summarization by Extracting Sentences and Words

This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.

A Neural Attention Model for Abstractive Sentence Summarization

This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks

This work introduces a conditional recurrent neural network (RNN) that generates a summary of an input sentence and significantly outperforms the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents

This work proposes the first model for abstractive summarization of single, longer-form documents (e.g., research papers), consisting of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary.

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.

Generating Wikipedia by Summarizing Long Sequences

It is shown that generating English Wikipedia articles can be approached as multi-document summarization of source documents, and a neural abstractive model is introduced that can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles.