On the cross-lingual transferability of multilingual prototypical models across NLU tasks

Oralie Cattan, Sophie Rosset, Christophe Servan
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications when a sufficient number of training examples are available. In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages. Domain and language models are supposed to grow and change as the problem space evolves. On one hand, research on transfer learning has demonstrated the cross-lingual… 

Transformer-Based Multilingual Language Models in Cross-Lingual Plagiarism Detection

This work evaluates and compares the effectiveness of six pretrained Transformer-based language models on the task of cross-lingual sentence alignment to determine the best model in terms of processing speed, accuracy, and cross-lingual transferability.

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

This paper benchmarks thirteen well-established Transformer-based models on the two available spoken language understanding tasks for French: MEDIA and ATIS-FR and shows that compact models can reach comparable results to bigger ones while their ecological impact is considerably lower.

State-of-the-art generalisation research in NLP: a taxonomy and review

A taxonomy for characterising and understanding generalisation research in NLP is presented and used to build a comprehensive map of published generalisation studies, and recommendations are made for areas that might deserve attention in the future.



Unsupervised Cross-lingual Representation Learning at Scale

It is shown that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks, and the possibility of multilingual modeling without sacrificing per-language performance is shown for the first time.

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

It is argued that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and this bottleneck is overcome via language-specific components and deepening NMT architectures.

Online adaptative zero-shot learning spoken language understanding using word-embedding

The proposed approach can significantly improve the performance of the spoken language understanding module on the second Dialog State Tracking Challenge (DSTC2) datasets, and it is extended with an online adaptive strategy that progressively refines the initial model with only light and adjustable supervision.

Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks

LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label, than self-supervised pre-training or multi-task training.

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

This work proposes a novel end-to-end model that learns to align and predict slots in a multilingual NLU system and uses the corpus to explore various cross-lingual transfer methods focusing on the zero-shot setting and leveraging MT for language expansion.

Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks

This paper explores the model-agnostic meta-learning algorithm (MAML) and its variants for low-resource NLU tasks and empirically demonstrates that the learned representations can be adapted to new tasks efficiently and effectively.

How Multilingual is Multilingual BERT?

It is concluded that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs, and that the model can find translation pairs.

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

This paper proposes a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text, and shows that this meta-training leads to better few-shot generalization than language-model pre-training followed by finetuning.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

Deep Contextualized Word Representations

A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.