Corpus ID: 58981712

Cross-lingual Language Model Pretraining

@inproceedings{Lample2019CrosslingualLM,
  title={Cross-lingual Language Model Pretraining},
  author={Guillaume Lample and Alexis Conneau},
  booktitle={NeurIPS},
  year={2019}
}
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on…
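For orientation, the two objectives referenced in the abstract are masked language modeling (MLM) on monolingual text and translation language modeling (TLM) on concatenated parallel sentence pairs. Below is a minimal sketch of how a TLM training example could be assembled; the special tokens, the 15% masking rate, and the helper name are illustrative, not the paper's exact implementation.

```python
import random

MASK, BOS, SEP = "[MASK]", "<s>", "</s>"

def make_tlm_example(src_tokens, tgt_tokens, mask_prob=0.15, seed=0):
    """Build one translation language modeling (TLM) example: concatenate a
    parallel sentence pair and mask random tokens in both halves, so the
    model can attend across languages to recover them."""
    rng = random.Random(seed)
    stream = [BOS] + src_tokens + [SEP] + tgt_tokens + [SEP]
    inputs, labels = [], []
    for tok in stream:
        if tok not in (BOS, SEP) and rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)        # predict the original token here
        else:
            inputs.append(tok)
            labels.append(None)       # no loss at unmasked positions
    return inputs, labels

en = "the cat sits on the mat".split()
fr = "le chat est assis sur le tapis".split()
inputs, labels = make_tlm_example(en, fr)
print(inputs)
print(labels)
```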
Alternating Language Modeling for Cross-Lingual Pre-Training
TLDR: This work code-switches sentences of different languages rather than simply concatenating them, hoping to capture the rich cross-lingual context of words and phrases, and shows that ALM can outperform previous pre-training methods on three benchmarks.
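As a rough illustration of the code-switching idea in the entry above, the sketch below substitutes some words of a sentence with translations drawn from a bilingual lexicon; the lexicon, the substitution rate, and the function name are hypothetical stand-ins, not ALM's actual procedure (which builds mixed sentences from parallel data).

```python
import random

def code_switch(tokens, lexicon, swap_prob=0.3, seed=0):
    """Produce a mixed-language sentence by swapping some words for their
    translations in a (hypothetical) bilingual lexicon, in the spirit of
    alternating-language pretraining examples."""
    rng = random.Random(seed)
    return [lexicon[t] if t in lexicon and rng.random() < swap_prob else t
            for t in tokens]

lexicon = {"cat": "chat", "mat": "tapis", "sits": "est assis"}
print(code_switch("the cat sits on the mat".split(), lexicon))
```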
Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
TLDR: This paper enhances bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings, and demonstrates improved performance on both UNMT and bilingual lexicon induction compared to a UNMT baseline.
Cross-Lingual Natural Language Generation via Pre-Training
TLDR: Experimental results on question generation and abstractive summarization show that the model outperforms machine-translation-based pipeline methods for zero-shot cross-lingual generation and improves NLG performance in low-resource languages by leveraging rich-resource language data.
Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks
TLDR: It is found that fine-tuning on multiple languages together brings further improvements to Unicoder, a universal language encoder that is insensitive to different languages.
Cross-Lingual Language Model Meta-Pretraining
Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao · ArXiv, 2021
TLDR: This paper proposes cross-lingual language model meta-pretraining, which introduces an additional meta-pretraining phase before cross-lingual pretraining: the model first learns generalization ability on a large-scale monolingual corpus and then focuses on learning cross-lingual transfer on a multilingual corpus.
Explicit Cross-lingual Pre-training for Unsupervised Machine Translation
TLDR: This paper proposes a new pre-training model called the Cross-lingual Masked Language Model (CMLM), which randomly chooses source n-grams in the input text stream and predicts their translation candidates at each time step, significantly improving the performance of unsupervised machine translation.
Exploring Cross-Lingual Transfer Learning with Unsupervised Machine Translation
TLDR: The experimental results show that applying UMT enables TALL to consistently achieve better CLTL performance than the baseline model (the pre-trained multilingual language model that serves as TALL's encoder) without using more annotated data, and that the performance gain is especially prominent for distant languages.
Mixed-Lingual Pre-training for Cross-lingual Summarization
TLDR: This work proposes a solution based on mixed-lingual pre-training that leverages massive monolingual data to enhance its language modeling and has no task-specific components, which saves memory and increases optimization efficiency.
Cross-lingual Language Model Pretraining for Retrieval
TLDR: This paper introduces two novel retrieval-oriented pretraining tasks to further pretrain cross-lingual language models for downstream retrieval tasks such as cross-lingual ad-hoc retrieval (CLIR), and proposes to directly fine-tune language models on part of the evaluation collection by making Transformers capable of accepting longer sequences.
Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model
TLDR: A novel unsupervised feature decomposition method is proposed that can automatically extract domain-specific and domain-invariant features from entangled pretrained cross-lingual representations, given only unlabeled raw text in the source language.

References

Showing 1–10 of 52 references
XNLI: Evaluating Cross-lingual Sentence Representations
TLDR: This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 14 languages, including low-resource languages such as Swahili and Urdu, and finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.
Phrase-Based & Neural Unsupervised Machine Translation
TLDR: This work investigates how to learn to translate with access only to large monolingual corpora in each language, and proposes two model variants, a neural and a phrase-based model, both of which are significantly better than methods from the literature while being simpler and having fewer hyper-parameters.
Word Translation Without Parallel Data
TLDR: It is shown that a bilingual dictionary can be built between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
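A common refinement step in this line of work solves an orthogonal Procrustes problem once a seed or induced dictionary is available. Below is a minimal numpy sketch with made-up toy data and illustrative dimensions; the paper's full method also includes adversarial initialization and CSLS retrieval, which are not shown here.

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal Procrustes: the W minimizing ||W X - Y||_F over orthogonal
    matrices is W = U V^T, where U S V^T is the SVD of Y X^T. Columns of X
    and Y hold source/target word vectors for dictionary pairs."""
    U, _, Vt = np.linalg.svd(Y @ X.T)
    return U @ Vt

# Toy data: 300-d "embeddings" for 1000 dictionary pairs, where the target
# space is an exact rotation of the source space.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 1000))               # source-language vectors
true_W, _ = np.linalg.qr(rng.standard_normal((300, 300)))
Y = true_W @ X                                     # target-language vectors
W = procrustes(X, Y)
print(np.allclose(W, true_W))                      # True: mapping recovered
```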
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
TLDR: An architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts, using a single BiLSTM encoder with a shared byte-pair encoding vocabulary for all languages, coupled with an auxiliary decoder and trained on publicly available parallel corpora.
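A minimal sketch of the encoder shape described above: a shared embedding table over a joint subword vocabulary feeds a BiLSTM, and a fixed-size sentence vector is obtained by max-pooling over time. PyTorch is assumed; the sizes and single layer are illustrative, while the actual system stacks several BiLSTM layers and is trained with an auxiliary decoder on parallel data.

```python
import torch
import torch.nn as nn

class BiLSTMSentenceEncoder(nn.Module):
    """Shared-vocabulary BiLSTM sentence encoder with max-pooling over the
    hidden states; dimensions here are illustrative, not the paper's config."""
    def __init__(self, vocab_size=50000, emb_dim=320, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embed(token_ids))  # (batch, time, 2*hidden)
        return states.max(dim=1).values               # max-pool over time

encoder = BiLSTMSentenceEncoder()
sentence = torch.randint(0, 50000, (1, 12))  # one sentence of 12 subword ids
print(encoder(sentence).shape)               # torch.Size([1, 1024])
```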
Unsupervised Neural Machine Translation
TLDR: This work proposes a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora: a slightly modified attentional encoder-decoder model that can be trained on monolingual corpora alone using a combination of denoising and backtranslation.
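To make the two ingredients named above concrete, the sketch below shows (a) a toy denoising corruption and (b) how backtranslation turns monolingual target sentences into synthetic parallel pairs. The drop probability, the function names, and the `reverse_translate` callable are placeholders, not the paper's exact recipe.

```python
import random

def add_noise(tokens, drop_prob=0.1, seed=0):
    """Denoising input: corrupt a sentence by randomly dropping words (real
    systems also shuffle locally); the model is trained to reconstruct the
    original sentence."""
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() > drop_prob]
    return kept or tokens  # never return an empty sentence

def backtranslation_pairs(tgt_sentences, reverse_translate):
    """Backtranslation: turn real target-language sentences into synthetic
    (source, target) training pairs using the current target-to-source
    model, here represented by the placeholder callable `reverse_translate`."""
    return [(reverse_translate(t), t) for t in tgt_sentences]

print(add_noise("le chat est assis sur le tapis".split()))
print(backtranslation_pairs(["le chat dort"], lambda s: "the cat sleeps"))
```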
Unsupervised Cross-lingual Word Embedding by Multilingual Neural Language Models
TLDR: The authors' model contains bidirectional LSTMs that perform as forward and backward language models, and these networks are shared among all the languages, so that word embeddings of each language are mapped into a common latent space, making it possible to measure the similarity of words across multiple languages.
Unsupervised Machine Translation Using Monolingual Corpora Only
TLDR: This work proposes a model that takes sentences from monolingual corpora in two different languages, maps them into the same latent space, and effectively learns to translate without using any labeled data.
Unsupervised Pretraining for Sequence to Sequence Learning
TLDR: This work presents a general unsupervised learning method to improve the accuracy of sequence-to-sequence (seq2seq) models: the weights of the encoder and decoder are initialized with the pretrained weights of two language models and then fine-tuned with labeled data.
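A tiny sketch of the warm-start idea described above, with PyTorch assumed and illustrative sizes: copy language-model weights into a structurally identical encoder before supervised fine-tuning; the decoder would be initialized from a target-side language model the same way.

```python
import torch.nn as nn

# Hypothetical warm start: nothing here is actually pretrained, and the
# architecture/sizes are illustrative stand-ins.
lm_lstm = nn.LSTM(input_size=256, hidden_size=512, num_layers=2,
                  batch_first=True)
encoder_lstm = nn.LSTM(input_size=256, hidden_size=512, num_layers=2,
                       batch_first=True)

# Copy the LM weights into the seq2seq encoder, which is then fine-tuned
# together with the decoder on labeled (parallel) data.
encoder_lstm.load_state_dict(lm_lstm.state_dict())
```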
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
TLDR: This work proposes a simple solution for using a single Neural Machine Translation (NMT) model to translate between multiple languages with a shared wordpiece vocabulary, and introduces an artificial token at the beginning of the input sentence to specify the required target language.
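The artificial-token trick is simple enough to show directly; a sketch with an illustrative token format (the '<2xx>' string mirrors the paper's convention, but the exact spelling is an implementation detail):

```python
def add_target_token(source_tokens, target_lang):
    """Prepend a token telling the multilingual model which language to
    translate into, e.g. '<2fr>' for French."""
    return [f"<2{target_lang}>"] + source_tokens

print(add_target_token("hello world".split(), "fr"))
# ['<2fr>', 'hello', 'world']
```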
Zero-Shot Cross-lingual Classification Using Multilingual Neural Machine Translation
TLDR: A simple framework for cross-lingual transfer learning that reuses the encoder from a multilingual NMT system and stitches it to a task-specific classifier component; the resulting model can perform classification in a new language for which no classification data was seen during training, showing that zero-shot classification is possible and remarkably competitive.
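A minimal sketch of the stitching described above, with PyTorch assumed; the placeholder encoder, pooling choice, and sizes are illustrative stand-ins for the reused multilingual NMT encoder.

```python
import torch
import torch.nn as nn

# Placeholder for the reused multilingual NMT encoder (hypothetical sizes);
# in practice its weights would be loaded from the trained translation model.
pretrained_encoder = nn.Sequential(
    nn.Embedding(32000, 512),
    nn.LSTM(512, 512, batch_first=True),
)

class ZeroShotClassifier(nn.Module):
    """Stitch a reused multilingual encoder to a small task-specific head.
    Training the head on labeled data in one language still allows
    classification in other languages the encoder covers (zero-shot)."""
    def __init__(self, encoder, enc_dim=512, num_classes=3):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(enc_dim, num_classes)

    def forward(self, token_ids):
        states, _ = self.encoder(token_ids)  # (batch, time, enc_dim)
        pooled = states.mean(dim=1)          # average-pool token states
        return self.head(pooled)             # task logits

model = ZeroShotClassifier(pretrained_encoder)
print(model(torch.randint(0, 32000, (2, 10))).shape)  # torch.Size([2, 3])
```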