MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer

Ilias Chalkidis, Manos Fergadiotis and Ion Androutsopoulos

We introduce MULTI-EURLEX, a new multilingual dataset for topic classification of legal documents. The dataset comprises 65k European Union (EU) laws, officially translated into 23 languages and annotated with multiple labels from the EUROVOC taxonomy. We highlight the effect of temporal concept drift and the importance of chronological, rather than random, splits. We use the dataset as a testbed for zero-shot cross-lingual transfer, where we exploit annotated training documents in one language…

A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction

The recent advances of deep learning have dramatically changed how machine learning, especially natural language processing, can be applied to the legal domain. However, this shift to…

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

The Legal General Language Understanding Evaluation (LexGLUE) benchmark is introduced: a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. Several generic and legal-oriented models are evaluated, demonstrating that the latter consistently offer performance improvements across multiple tasks.

HLDC: Hindi Legal Documents Corpus

The Hindi Legal Documents Corpus (HLDC), a corpus of more than 900K legal documents in Hindi, is introduced, along with the task of bail prediction as a use-case for the corpus; a Multi-Task Learning (MTL) based model is proposed.

A Comparison Study of Pre-trained Language Models for Chinese Legal Document Classification

Several strong pre-trained language models (PLMs) that differ in their pre-training corpora are evaluated on three datasets of Chinese legal documents; experimental results show that the model pre-trained on a legal corpus is the most effective on all datasets.

Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities

Multi-LexSum, a collection of 9,280 expert-authored summaries drawn from the ongoing writing of the Civil Rights Litigation Clearinghouse (CRLC), is introduced; despite the high-quality summaries in the training data, state-of-the-art summarization models perform poorly on this task.

The Unreasonable Effectiveness of the Baseline: Discussing SVMs in Legal Text Classification

It is shown that a more traditional approach based on Support Vector Machine classifiers reaches performance competitive with deep learning models, and that the error reduction obtained by specialised BERT-based models over baselines is noticeably smaller in the legal domain than in general language tasks.

Do We Need a Specific Corpus and Multiple High-Performance GPUs for Training the BERT Model? An Experiment on COVID-19 Dataset

A method of building an unsupervised, zero-shot classification model on top of the pre-trained BERT model is proposed; it reaches an accuracy of 27.84%, which is 6.73 percentage points below the best-achieved accuracy, but comparable.

Corpus for Automatic Structuring of Legal Documents

A corpus of legal judgment documents in English is introduced, segmented into topical and coherent parts, each annotated with a label from a list of pre-defined Rhetorical Roles.

Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

Reframing group-robust algorithms as adaptation algorithms under concept drift, it is found that Invariant Risk Minimization and Spectral Decoupling outperform sampling-based approaches to class imbalance and concept drift, and lead to much better performance on minority classes.

A Survey on Legal Judgment Prediction: Datasets, Metrics, Models and Challenges

An up-to-date and comprehensive review of existing LJP tasks, datasets, models, and evaluations is provided to help researchers and legal professionals understand the status of LJP.

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

An architecture is proposed to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts, using a single BiLSTM encoder with a shared byte-pair encoding vocabulary for all languages, coupled with an auxiliary decoder and trained on publicly available parallel corpora.

Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

English is compared against other transfer languages for fine-tuning; other high-resource languages such as German and Russian often transfer more effectively, especially when the set of target languages is diverse or unknown a priori.

MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer

MAD-X is proposed, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations; it also introduces a novel invertible adapter architecture and a strong baseline method for adapting a pretrained multilingual model to a new language.
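The core idea, stacking a language adapter and a task adapter on top of each frozen transformer layer, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the dimensions are toy-sized, and the names `adapter` and `mad_x_forward` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adapter(vec, down, up):
    # Bottleneck adapter: down-project, ReLU, up-project, residual add.
    hidden = [max(0.0, dot(row, vec)) for row in down]
    delta = [dot(row, hidden) for row in up]
    return [v + d for v, d in zip(vec, delta)]

def mad_x_forward(vec, lang_adapter, task_adapter):
    # MAD-X applies the language adapter first, then the task adapter;
    # `vec` stands in for a (frozen) transformer layer output.
    vec = adapter(vec, *lang_adapter)
    return adapter(vec, *task_adapter)
```

Zero-shot transfer then amounts to swapping in a different language adapter at inference time while keeping the task adapter fixed.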

XNLI: Evaluating Cross-lingual Sentence Representations

This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 15 languages, including low-resource languages such as Swahili and Urdu; it finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

On the Cross-lingual Transferability of Monolingual Representations

This work designs an alternative approach that transfers a monolingual model to new languages at the lexical level and shows that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD).

Large-Scale Multi-Label Text Classification on EU Legislation

This work releases a new dataset of 57k legislative documents from EUR-LEX, annotated with ∼4.3k EUROVOC labels, suitable for LMTC, few- and zero-shot learning, and shows that BIGRUs with label-wise attention perform better than other current state-of-the-art methods.
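Label-wise attention means each label learns its own attention distribution over the token representations, yielding one label-specific document vector per label. A minimal pure-Python sketch, with toy vectors and a hypothetical `label_wise_attention` name (not the paper's code):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def label_wise_attention(token_vecs, label_queries):
    """For each label query, attend over the token representations and
    return one attention-pooled document vector per label."""
    doc_vecs = []
    for q in label_queries:
        weights = softmax([dot(q, t) for t in token_vecs])
        pooled = [sum(w * t[i] for w, t in zip(weights, token_vecs))
                  for i in range(len(token_vecs[0]))]
        doc_vecs.append(pooled)
    return doc_vecs
```

In the paper's setting the token representations would come from a BiGRU encoder and each pooled vector would feed that label's classifier.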

MultiFiT: Efficient Multi-lingual Language Model Fine-tuning

Multi-lingual language model Fine-Tuning (MultiFiT) is proposed to enable practitioners to train and fine-tune language models efficiently in their own language; a zero-shot method using an existing pretrained cross-lingual model is also proposed.

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark is introduced, a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.

mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this…