Scalable Cross-Lingual Transfer of Neural Sentence Embeddings

  title={Scalable Cross-Lingual Transfer of Neural Sentence Embeddings},
  author={Hanan Aldarmaki and Mona T. Diab},
We develop and investigate several cross-lingual alignment approaches for neural sentence embedding models, such as the supervised inference classifier, InferSent, and sequential encoder-decoder models. We evaluate three alignment frameworks applied to these models: joint modeling, representation transfer learning, and sentence mapping, using parallel text to guide the alignment. Our results support representation transfer as a scalable approach for modular cross-lingual alignment of neural… 

Figures and Tables from this paper

Probing Multilingual Sentence Representations With X-Probe
It is discovered that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.
Empirical Study of Sentence Embeddings for English Sentences Quality Assessment
  • Pablo Rivas, Marcus Zimmermann
  • Computer Science
    2019 International Conference on Computational Science and Computational Intelligence (CSCI)
  • 2019
The study suggests that these state-of-the-art sentence embeddings are unable to capture sufficient information regarding sentence correctness and quality in the English language.
  • Computer Science
  • 2019
Cross-lingual knowledge distillation (XD) is introduced, where the same polyglot model is used both as teacher and student across languages to improve its sentence representations without using the end-task labels.


Cross-lingual Models of Word Embeddings: An Empirical Comparison
It is shown that models which require expensive cross-lingual knowledge almost always perform better, but cheaply supervised models often prove competitive on certain tasks.
Learning Cross-lingual Word Embeddings via Matrix Co-factorization
A matrix cofactorization framework for learning cross-lingual word embeddings that explicitly defines monolingual training objectives in the form of matrix decomposition, and induces cross-lingsual constraints for simultaneously factorizing monolingUAL matrices.
Learning Joint Multilingual Sentence Representations with Neural Machine Translation
This paper uses the framework of neural machine translation to learn joint sentence representations across six very different languages, and provides experimental evidence that sentences that are close in embedding space are indeed semantically highly related, but often have quite different structure and syntax.
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
It is shown that bilingual embeddings learned using the proposed BilBOWA model outperform state-of-the-art methods on a cross-lingual document classification task as well as a lexical translation task on WMT11 data.
XNLI: Evaluating Cross-lingual Sentence Representations
This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 14 languages, including low-resource languages such as Swahili and Urdu and finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.
Learning Cross-lingual Representations with Matrix Factorization
The results do not beat state-of-the-art performance in these tasks, but it is shown that the crosslingual models are at least as good as their monolingual counterparts.
Context-Aware Cross-Lingual Mapping
An alternative to word-level mapping that better reflects sentence-level cross-lingual similarity is proposed that incorporates context in the transformation matrix by directly mapping the averaged embeddings of aligned sentences in a parallel corpus.
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
Massively Multilingual Word Embeddings
New methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space are introduced and a new evaluation method is shown to correlate better than previous ones with two downstream tasks.