Publications
Unsupervised Cross-lingual Representation Learning at Scale
TLDR
It is shown that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks and, for the first time, that multilingual modeling is possible without sacrificing per-language performance.
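To make the transfer setting concrete, here is a minimal sketch of zero-shot cross-lingual classification with a pretrained multilingual encoder, assuming the Hugging Face transformers package and its public xlm-roberta-base checkpoint (the fine-tuning loop itself is elided):

```python
# Minimal sketch: zero-shot cross-lingual transfer with a multilingual
# encoder. Assumes the Hugging Face `transformers` package and the public
# "xlm-roberta-base" checkpoint; fine-tuning details are elided.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3
)

# Fine-tune on English task data, then evaluate directly on other languages:
batch = tokenizer(["Das ist ein Test."], return_tensors="pt")
logits = model(**batch).logits  # class scores for the German input
```

The point of the setup is that the same weights serve every language, so fine-tuning on English task data alone yields a usable classifier for the other languages.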
Cross-lingual Language Model Pretraining
TLDR
This work proposes two methods to learn cross-lingual language models (XLMs): an unsupervised method that relies only on monolingual data, and a supervised method that leverages parallel data with a new cross-lingual language model objective.
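A toy illustration of the supervised objective (translation language modeling) may help: a parallel sentence pair is concatenated and tokens are masked on both sides, so a masked word can be predicted from context in either language. This token-level sketch is a deliberate simplification of the paper's subword-level, BERT-style masking:

```python
import random

MASK = "<mask>"

def tlm_example(src_tokens, tgt_tokens, p=0.15):
    """Sketch of the TLM objective: concatenate a parallel sentence pair
    and mask tokens on both sides, so predicting a masked word can rely
    on context in either language. Toy token-level version; the real
    objective works on subword IDs with BERT-style masking."""
    stream = src_tokens + ["</s>"] + tgt_tokens
    inputs, targets = [], []
    for tok in stream:
        if tok != "</s>" and random.random() < p:
            inputs.append(MASK)
            targets.append(tok)   # model must predict the original token
        else:
            inputs.append(tok)
            targets.append(None)  # no loss at unmasked positions
    return inputs, targets

inp, tgt = tlm_example("the cat sleeps".split(), "le chat dort".split())
```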
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
TLDR
It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
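As a sketch of how this supervised training works, the classifier below sees both sentence vectors plus their element-wise difference and product, the feature combination used for the NLI task; the encoder argument is a stand-in (the paper's best encoder is a BiLSTM with max pooling):

```python
import torch
import torch.nn as nn

class NLIClassifier(nn.Module):
    """Sketch of the NLI training setup: a shared sentence encoder
    produces u (premise) and v (hypothesis); the classifier sees the
    combination [u; v; |u - v|; u * v]. The encoder here is a stand-in;
    the paper's best encoder is a BiLSTM with max pooling."""
    def __init__(self, encoder, dim, n_classes=3):
        super().__init__()
        self.encoder = encoder
        self.mlp = nn.Sequential(
            nn.Linear(4 * dim, 512), nn.ReLU(), nn.Linear(512, n_classes)
        )

    def forward(self, premise, hypothesis):
        u, v = self.encoder(premise), self.encoder(hypothesis)
        feats = torch.cat([u, v, torch.abs(u - v), u * v], dim=1)
        return self.mlp(feats)
```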
Word Translation Without Parallel Data
TLDR
It is shown that a bilingual dictionary can be built between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
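One concrete piece of this pipeline is the orthogonal Procrustes refinement step: given (pseudo-)dictionary pairs, the best orthogonal map between the two embedding spaces has a closed form. A minimal NumPy sketch, assuming the word pairs are already extracted (in the paper the initial pairs come from unsupervised adversarial training):

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal Procrustes step used to refine an embedding alignment:
    given source vectors X and target vectors Y for dictionary pairs
    (rows aligned), find the orthogonal W minimising ||X W - Y||_F.
    Closed form: W = U V^T, where U S V^T = SVD(X^T Y)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: map 300-d source embeddings into the target space.
X = np.random.randn(1000, 300)
Y = np.random.randn(1000, 300)
W = procrustes_align(X, Y)
X_aligned = X @ W
```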
XNLI: Evaluating Cross-lingual Sentence Representations
TLDR
This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 15 languages, including low-resource languages such as Swahili and Urdu. It finds that XNLI represents a practical and challenging evaluation suite, and that directly translating the test data yields the best performance among the available baselines.
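The translate-test baseline mentioned above is easy to state in code; in the sketch below the MT system and the English NLI classifier are caller-supplied callables, since both are hypothetical stand-ins here:

```python
def translate_test_accuracy(test_pairs, labels, translate, nli_model):
    """Score the "translate-test" baseline: machine-translate each foreign
    premise/hypothesis pair into English, then classify it with an
    English-only NLI model. `translate` and `nli_model` are caller-supplied
    callables (stand-ins for an MT system and a trained classifier)."""
    correct = 0
    for (premise, hypothesis), gold in zip(test_pairs, labels):
        pred = nli_model(translate(premise), translate(hypothesis))
        correct += int(pred == gold)  # entailment/neutral/contradiction
    return correct / len(labels)
```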
Phrase-Based & Neural Unsupervised Machine Translation
TLDR
This work investigates how to learn to translate with access only to large monolingual corpora in each language, and proposes two model variants, a neural and a phrase-based model, both significantly better than methods from the literature while being simpler and having fewer hyper-parameters.
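Both variants share an iterative back-translation loop. Below is a hedged sketch of one training epoch, with the two translation models, the monolingual batch iterators, and the supervised update step all as hypothetical stand-ins:

```python
def backtranslation_epoch(src2tgt, tgt2src, mono_src, mono_tgt, train_step):
    """One epoch of iterative back-translation: each direction is trained
    on synthetic parallel data produced by the current model for the
    opposite direction. All arguments are hypothetical stand-ins (two
    translation models, two monolingual batch iterators, and a
    supervised update function)."""
    for src_batch, tgt_batch in zip(mono_src, mono_tgt):
        fake_tgt = src2tgt.translate(src_batch)   # synthetic target side
        fake_src = tgt2src.translate(tgt_batch)   # synthetic source side
        train_step(tgt2src, fake_tgt, src_batch)  # learn tgt->src from (fake_tgt, real src)
        train_step(src2tgt, fake_src, tgt_batch)  # learn src->tgt from (fake_src, real tgt)
```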
What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties
TLDR
Ten probing tasks designed to capture simple linguistic features of sentences are introduced and used to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods.
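As a sketch of what a probing task looks like in practice, the function below trains a simple classifier to predict binned sentence length from frozen embeddings; the encoder is caller-supplied and the length bins are illustrative, not the paper's exact setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_length(embed, sentences, n_train=8000):
    """Run one probing task: train a simple classifier to predict a
    surface property (binned sentence length) from frozen sentence
    embeddings. High accuracy means the embeddings encode that property.
    `embed` is a caller-supplied sentence encoder; bins are illustrative."""
    X = np.stack([embed(s) for s in sentences])
    y = np.digitize([len(s.split()) for s in sentences],
                    bins=[5, 10, 15, 20])          # length buckets
    clf = LogisticRegression(max_iter=1000).fit(X[:n_train], y[:n_train])
    return clf.score(X[n_train:], y[n_train:])     # held-out probing accuracy
```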
Very Deep Convolutional Networks for Text Classification
TLDR
This work presents a new architecture (VDCNN) for text processing which operates directly at the character level and uses only small convolutions and pooling operations, and shows that the performance of this model increases with depth.
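A minimal PyTorch sketch of the basic building block this describes, two stacked convolutions with kernel size 3 plus batch normalization and ReLU (channel sizes and the surrounding pooling and stacking are elided):

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    """Sketch of the basic VDCNN-style building block: two stacked
    convolutions with small kernels (size 3), each followed by batch
    norm and ReLU. Blocks are interleaved with pooling and stacked to
    form a very deep character-level network; sizes are illustrative."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
        )

    def forward(self, x):  # x: (batch, channels, seq_len)
        return self.net(x)
```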
SentEval: An Evaluation Toolkit for Universal Sentence Representations
We introduce SentEval, a toolkit for evaluating the quality of universal sentence representations. SentEval encompasses a variety of tasks, including binary and multi-class classification, natural language inference and sentence similarity.
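Usage follows the interface of the facebookresearch/SentEval reference implementation: the caller supplies prepare and batcher callbacks and the toolkit runs the transfer tasks. In this sketch, word_vec is a hypothetical word-to-vector lookup standing in for real pretrained embeddings:

```python
# Sketch of the SentEval evaluation loop, following the interface of the
# facebookresearch/SentEval reference implementation. `word_vec` is a
# hypothetical {word: np.ndarray} lookup standing in for real pretrained
# embeddings (e.g. loaded from a GloVe or fastText file).
import numpy as np
import senteval

def prepare(params, samples):
    pass  # e.g. build a vocabulary from the evaluation samples

def batcher(params, batch):
    # Return one fixed-size embedding per sentence (here: mean word vector).
    return np.stack([np.mean([word_vec[w] for w in sent], axis=0)
                     for sent in batch])

params = {"task_path": "SentEval/data", "usepytorch": False, "kfold": 5}
se = senteval.engine.SE(params, batcher, prepare)
results = se.eval(["MR", "SST2", "SICKEntailment"])
```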