Learning principled bilingual mappings of word embeddings while preserving monolingual invariance

@inproceedings{Artetxe2016LearningPB,
  title={Learning principled bilingual mappings of word embeddings while preserving monolingual invariance},
  author={Mikel Artetxe and Gorka Labaka and Eneko Agirre},
  booktitle={EMNLP},
  year={2016}
}
Mapping word embeddings of different languages into a single space has multiple applications. In order to map from a source space into a target space, a common approach is to learn a linear mapping that minimizes the distances between equivalences listed in a bilingual dictionary. In this paper, we propose a framework that generalizes previous work, provides an efficient exact method to learn the optimal linear transformation and yields the best bilingual results in translation induction while preserving monolingual performance in an analogy task.
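Under an orthogonality constraint, this kind of dictionary-fitted linear mapping has a closed-form solution via SVD (the classical orthogonal Procrustes problem, closely related to the exact method the abstract mentions). Below is a minimal NumPy sketch of that special case; the function name and toy data are illustrative, not taken from the paper.

import numpy as np

def learn_orthogonal_mapping(X, Z):
    # Exact solution of min_W ||X W - Z||_F subject to W^T W = I
    # (orthogonal Procrustes). X and Z hold the source- and target-language
    # embeddings of the dictionary pairs, one translation pair per row.
    U, _, Vt = np.linalg.svd(X.T @ Z)
    return U @ Vt

# Toy usage: recover a random rotation exactly.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))  # a random orthogonal map
W = learn_orthogonal_mapping(X, X @ Q)
print(np.allclose(X @ W, X @ Q))  # True: the rotation is recovered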

Citations

Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion
TLDR: This paper proposes a unified formulation that directly optimizes a retrieval criterion in an end-to-end fashion for word translation, and shows that this approach outperforms the state of the art on word translation; the retrieval criterion is sketched below.
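For context, the retrieval criterion in question is CSLS (cross-domain similarity local scaling). The following is a rough NumPy sketch of CSLS-based retrieval, assuming unit-normalized embeddings; the cited paper optimizes a relaxation of this score during training, and the names here are illustrative.

import numpy as np

def csls_retrieve(mapped_src, tgt, k=10):
    # CSLS(x, y) = 2*cos(x, y) - mean cos of x to its k nearest target
    # neighbors - mean cos of y to its k nearest mapped-source neighbors.
    # Inputs are assumed unit-normalized, so dot products are cosines.
    sims = mapped_src @ tgt.T                            # (m, n) cosine matrix
    r_src = np.sort(sims, axis=1)[:, -k:].mean(axis=1)   # (m,) neighborhood density
    r_tgt = np.sort(sims, axis=0)[-k:, :].mean(axis=0)   # (n,) neighborhood density
    csls = 2 * sims - r_src[:, None] - r_tgt[None, :]
    return csls.argmax(axis=1)  # CSLS-best target index for each source word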
Improving Supervised Bilingual Mapping of Word Embeddings
TLDR: This work proposes to use a retrieval criterion instead of the square loss for learning the mapping of continuous word representations, and shows that this loss function leads to state-of-the-art results, with the biggest improvements observed for distant language pairs such as English-Chinese.
Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations
TLDR: A multi-step framework of linear transformations is proposed that generalizes a substantial body of previous work, allows new insights into the behavior of existing methods (including the effectiveness of inverse regression), and yields a novel variant that obtains the best published results in zero-shot bilingual lexicon extraction.
A Locally Linear Procedure for Word Translation
TLDR: This work proposes a natural extension of the orthogonal Procrustes analysis algorithm that uses multiple orthogonal translation matrices to model the mapping, derives an algorithm to learn these matrices, and shows how multiple matrices can model multiple senses of a word.
Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach
TLDR: This work proposes a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary that outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks.
Improving Japanese-English Bilingual Mapping of Word Embeddings based on Language Specificity
TLDR: This paper focuses on learning a Japanese-English bilingual word embedding mapping by considering the specificity of the Japanese language, and proposes an advanced method for improving bilingual word embeddings by adding a language-specific mapping.
Learning bilingual word embeddings with (almost) no bilingual data
TLDR: This work further reduces the need for bilingual resources using a very simple self-learning approach that can be combined with any dictionary-based mapping technique, and works with as little bilingual evidence as a 25-word dictionary or even an automatically generated list of numerals; the loop is sketched below.
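As a rough sketch of such a self-learning loop (not the authors' exact procedure: normalization, symmetric dictionary induction, and stopping criteria are omitted, and all names are illustrative), assuming unit-normalized embeddings:

import numpy as np

def self_learning(X, Z, seed_src, seed_tgt, iters=5):
    # X: (m, d) source embeddings, Z: (n, d) target embeddings, both
    # unit-normalized; seed_src/seed_tgt index the seed dictionary pairs.
    src, tgt = np.asarray(seed_src), np.asarray(seed_tgt)
    for _ in range(iters):
        # Fit an orthogonal map on the current dictionary (Procrustes).
        U, _, Vt = np.linalg.svd(X[src].T @ Z[tgt])
        W = U @ Vt
        # Re-induce the dictionary: nearest target neighbor of each
        # mapped source word under cosine similarity.
        src = np.arange(len(X))
        tgt = (X @ W @ Z.T).argmax(axis=1)
    return W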
Density Matching for Bilingual Word Embedding
TLDR: This paper proposes an approach that expresses the two monolingual embedding spaces as probability densities defined by a Gaussian mixture model, and matches the two densities using a method called normalizing flow, and argues that this formulation has several intuitively attractive properties.
NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings
TLDR: This work proposes a method for learning neighborhood sensitive maps, NORMA, and shows that NORMA outperforms current state-of-the-art methods for word translation between distant languages.
SPMM: A Soft Piecewise Mapping Model for Bilingual Lexicon Induction
TLDR: A Soft Piecewise Mapping Model (SPMM) is proposed, which generates word alignments in two languages by learning multiple mapping matrices under an orthogonality constraint; experiments show that SPMM is effective and outperforms previous methods.

References

Showing 1-10 of 15 references
Bilingual Word Embeddings for Phrase-Based Machine Translation
TLDR: A method is proposed to learn bilingual embeddings from a large unlabeled corpus, while utilizing MT word alignments to constrain translational equivalence; the resulting embeddings significantly outperform baselines in word semantic similarity.
Bilingual Word Representations with Monolingual Quality in Mind
TLDR: This work proposes a joint model to learn word representations from scratch that utilizes both the context co-occurrence information through the monolingual component and the meaning-equivalence signals from the bilingual constraint to learn high-quality bilingual representations efficiently.
Deep Multilingual Correlation for Improved Word Embeddings
TLDR: Deep non-linear transformations of word embeddings of the two languages are learned, using the recently proposed deep canonical correlation analysis, to improve their quality and consistency on multiple word and bigram similarity tasks.
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
TLDR: It is shown that bilingual embeddings learned using the proposed BilBOWA model outperform state-of-the-art methods on a cross-lingual document classification task as well as a lexical translation task on WMT11 data.
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation
TLDR: A solution is proposed that normalizes the word vectors on a hypersphere and constrains the linear transform to be orthogonal, offering better performance on a word similarity task and an English-to-Spanish word translation task.
Simple task-specific bilingual word embeddings
TLDR: A simple wrapper method that uses off-the-shelf word embedding algorithms to learn task-specific bilingual word embeddings that is independent of the choice of embedding algorithm, does not require parallel data, and can be adapted to specific tasks by re-defining the equivalence classes.
Learning Bilingual Word Representations by Marginalizing Alignments
TLDR: A probabilistic model is proposed that simultaneously learns alignments and distributed representations for bilingual data; by marginalizing over word alignments, it captures a larger semantic context than prior work relying on hard alignments.
An Autoencoder Approach to Learning Bilingual Word Representations
TLDR: This work explores the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments, and achieves state-of-the-art performance.
Minimally-Constrained Multilingual Embeddings via Artificial Code-Switching
TLDR: A method is presented that consumes a large corpus of multilingual text and produces a single, unified word embedding in which the word vectors generalize across languages, and which is agnostic about the languages in which the documents in the corpus are expressed.
Improving Vector Space Word Representations Using Multilingual Correlation
TLDR: This paper argues that lexico-semantic content should additionally be invariant across languages and proposes a simple technique based on canonical correlation analysis (CCA) for incorporating multilingual evidence into vectors generated monolingually; a minimal sketch of the CCA step follows below.
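A minimal sketch of that CCA step with scikit-learn, assuming the rows of X and Z are the source- and target-side embeddings of dictionary translation pairs; the toy data and names are illustrative, not the cited paper's setup.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))          # source-side vectors of dictionary pairs
Z = X @ rng.standard_normal((50, 50)) / 50**0.5 \
    + 0.1 * rng.standard_normal((500, 50))  # noisy target side

# Project both spaces onto their maximally correlated dimensions;
# the projected vectors share a common multilingual space.
cca = CCA(n_components=20)
X_c, Z_c = cca.fit_transform(X, Z)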