Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities

Zejiang Shen, Kyle Lo, Lauren Jane Yu, Nathan Dahlberg, Margo Schlanger, Doug Downey
With the advent of large language models, methods for abstractive summarization have made great strides, creating potential for use in applications to aid knowledge workers processing unwieldy document collections. One such setting is the Civil Rights Litigation Clearinghouse (CRLC), which posts information about large-scale civil rights lawsuits, serving lawyers, scholars, and the general public. Today, summarization in the CRLC requires extensive training of lawyers and law students who…


ClassActionPrediction: A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US

This work releases, for the first time, a challenging LJP dataset focused on class action cases in the US, and shows that the Longformer model clearly outperforms the human baseline (63%), despite considering only the first 2,048 tokens.

Task-aware Retrieval with Instructions

TART shows strong capabilities to adapt to a new task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger.

An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

This work develops and releases fully pre-trained HAT models that use segment-wise followed by cross-segment encoders, and compares them with Longformer models and partially pre-trained HATs, finding that HATs perform best with cross-segment contextualization throughout the model, rather than alternative configurations that implement either early or late cross-segment contextualization.

BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?

Longformer models are trained with the Replaced Token Detection task on legal data to showcase that pretraining efficient LMs is possible using much less compute, and it is shown that both the small and base models outperform their baselines on the in-domain BillSum and out-of-domain PubMed tasks in their respective parameter range.

LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval

This paper introduces LawngNLI, constructed from U.S. legal opinions with automatic labels of high human-validated accuracy, which can train and test systems for implication-based case retrieval and argumentation, and benchmarks in-domain generalization from short to long contexts.

Moving beyond word lists: towards abstractive topic labels for human-like topics of scientific documents

An approach to generating human-like topic labels using abstractive multi-document summarization (MDS) and some thoughts on how topic modeling can be used to improve MDS in general are presented.



LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

The Legal General Language Understanding Evaluation (LexGLUE) benchmark is introduced: a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way, along with several generic and legal-oriented models, demonstrating that the latter consistently offer performance improvements across multiple tasks.

BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization

This work presents a novel dataset, BIGPATENT, consisting of 1.3 million records of U.S. patent documents along with human-written abstractive summaries, which has the following properties: i) summaries contain a richer discourse structure with more recurring entities, ii) salient content is evenly distributed in the input, and iii) fewer and shorter extractive fragments are present in the summaries.

BookSum: A Collection of Datasets for Long-form Narrative Summarization

The domain and structure of the dataset pose a unique set of challenges for summarization systems, including processing very long documents, non-trivial causal and temporal dependencies, and rich discourse structures.

Extractive summarisation of legal texts

Results are encouraging: state-of-the-art accuracy is achieved using robust, automatically generated cue phrase information, and the rhetorical annotation scheme proves useful as a model of legal discourse, providing a clear means for structuring summaries and tailoring them to different types of users.

HAUSS: Incrementally building a summarizer combining multiple techniques

LEDGAR: A Large-Scale Multi-label Corpus for Text Classification of Legal Provisions in Contracts

LEDGAR, a multi-label corpus of legal provisions in contracts, is presented; it is the first freely available corpus of its kind. Several methods to sample subcorpora from the corpus are discussed and implemented, and different automatic classification approaches are evaluated.

Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model

This work introduces Multi-News, the first large-scale MDS news dataset, and proposes an end-to-end model that combines a traditional extractive summarization model with a standard SDS model, achieving competitive results on MDS datasets.

CaseSummarizer: A System for Automated Summarization of Legal Texts

CaseSummarizer is presented, a tool for automated text summarization of legal documents which uses standard summary methods based on word frequency augmented with additional domain-specific knowledge.

MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer

It is found that fine-tuning a multilingually pretrained model (XLM-ROBERTA, MT5) in a single source language leads to catastrophic forgetting of multilingual knowledge and poor zero-shot transfer to other languages.

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

This work proposes pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective, PEGASUS, and demonstrates that it achieves state-of-the-art performance on all 12 downstream datasets as measured by ROUGE scores.