Cross-lingual Lifelong Learning

Meryem M'hamdi, Xiang Ren, Jonathan May
The longstanding goal of multi-lingual learning has been to develop a universal cross-lingual model that can withstand the changes in multi-lingual data distributions. However, most existing models assume full access to the target languages in advance, whereas in realistic scenarios this is not often the case, as new languages can be incorporated later on. In this paper, we present the Cross-lingual Lifelong Learning (CLL) challenge, where a model is…

Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training

Mini-model adaptation is proposed as a compute-efficient alternative that builds a shallow mini-model from a fraction of a large model’s parameters and matches the performance of the standard approach using up to 2.4x less compute.

On Robust Incremental Learning over Many Multilingual Steps

This work proposes a method for robust incremental learning over dozens of training steps using data from a variety of languages, and shows that a combination of data augmentation and an optimized training regime allows the model to continue improving even for as many as 10,000 training steps.

Parameter-Efficient Finetuning for Robust Continual Multilingual Learning

The proposed pipeline, LAFT-URIEL, improves the spread of gains over the supported languages while reducing the magnitude of language-specific losses incurred, and develops novel finetuning strategies that allow us to jointly minimize language-specific forgetting while encouraging the positive cross-lingual transfer observed in this setup.



On the Cross-lingual Transferability of Monolingual Representations

This work designs an alternative approach that transfers a monolingual model to new languages at the lexical level and shows that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD).

Unsupervised Cross-lingual Representation Learning at Scale

Pretraining multilingual language models at scale is shown to yield significant performance gains for a wide range of cross-lingual transfer tasks, and multilingual modeling without sacrificing per-language performance is demonstrated for the first time.

XNLI: Evaluating Cross-lingual Sentence Representations

This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 14 languages, including low-resource languages such as Swahili and Urdu, and finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.

X-METRA-ADA: Cross-lingual Meta-Transfer learning Adaptation to Natural Language Understanding and Question Answering

This work proposes X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for NLU that adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages and shows that this approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.

MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer

MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations, is proposed, along with a novel invertible adapter architecture and a strong baseline method for adapting a pretrained multilingual model to a new language.

Transfer Learning in Natural Language Processing

An overview of modern transfer learning methods in NLP is presented, covering how models are pre-trained, what information their learned representations capture, and examples and case studies of how these models can be integrated and adapted in downstream NLP tasks.

MLQA: Evaluating Cross-lingual Extractive Question Answering

This work presents MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area, and evaluates state-of-the-art cross-lingual models and machine-translation-based baselines on MLQA.

LAMOL: LAnguage MOdeling for Lifelong Language Learning

The results show that LAMOL prevents catastrophic forgetting without any sign of intransigence and can perform five very different language tasks sequentially with only one model.

MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark

A new multilingual dataset called MTOP, comprising 100k annotated utterances in 6 languages across 11 domains, is presented, and strong zero-shot performance is demonstrated using pre-trained models combined with automatic translation and alignment and a proposed distant supervision method to reduce the noise in slot label projection.

PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification

PAWS-X, a new dataset of 23,659 human-translated PAWS evaluation pairs in six typologically distinct languages, shows the effectiveness of deep, multilingual pre-training while also leaving considerable headroom as a new challenge to drive multilingual research that better captures structure and contextual information.