Cross-lingual Models of Word Embeddings: An Empirical Comparison

@article{Upadhyay2016CrosslingualMO,
  title={Cross-lingual Models of Word Embeddings: An Empirical Comparison},
  author={Shyam Upadhyay and Manaal Faruqui and Chris Dyer and Dan Roth},
  journal={ArXiv},
  year={2016},
  volume={abs/1604.00425}
}
Despite interest in using cross-lingual knowledge to learn word embeddings for various tasks, a systematic comparison of the possible approaches is lacking in the literature. [...] Key Result: We show that models which require expensive cross-lingual knowledge almost always perform better, but cheaply supervised models often prove competitive on certain tasks.

Citations

A Survey of Cross-lingual Word Embedding Models
TLDR
A comprehensive typology of cross-lingual word embedding models is provided, showing that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent modulo optimization strategies, hyper-parameters, and such.
A survey of cross-lingual embedding models
TLDR
This work surveys models that seek to learn cross-lingual embeddings and discusses them based on the type of approach and the nature of parallel data that they employ.
On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
TLDR
An extensive evaluation over multiple cross-lingual embedding models, analyzing their strengths and limitations with respect to different variables such as target language, training corpora and amount of supervision, puts in doubt the view that high-quality cross-lingual embeddings can always be learned without much supervision.
A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments
TLDR
It is suggested that adding additional sources of information, which go beyond the traditional signal of bilingual sentence-aligned corpora, may substantially improve cross-lingual word embeddings, and that future baselines should at least take such features into account.
Improving Cross-Lingual Word Embeddings by Meeting in the Middle
TLDR
This work proposes to apply an additional transformation after the initial alignment step, which moves cross-lingual synonyms towards a middle point between them, and aims to obtain a better cross-lingual integration of the vector spaces.
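A minimal sketch of that midpoint idea, assuming two embedding spaces already mapped into a shared space by an initial alignment and a dictionary of translation pairs. All names below are illustrative, and the direct interpolation is a simplification of the learned transformation described in the paper, which also generalizes the shift to words outside the training dictionary.

```python
import numpy as np

def meet_in_the_middle(src_emb, tgt_emb, pairs, alpha=0.5):
    """Move each word of a translation pair toward the pair's midpoint.

    src_emb, tgt_emb : dict mapping word -> 1-D numpy vector, already
                       projected into a shared space by an initial alignment.
    pairs            : iterable of (src_word, tgt_word) translation pairs.
    alpha            : interpolation weight (0 = keep original, 1 = midpoint).
    """
    src_new = {w: v.copy() for w, v in src_emb.items()}
    tgt_new = {w: v.copy() for w, v in tgt_emb.items()}
    for s, t in pairs:
        if s not in src_emb or t not in tgt_emb:
            continue
        mid = (src_emb[s] + tgt_emb[t]) / 2.0
        src_new[s] = (1 - alpha) * src_emb[s] + alpha * mid
        tgt_new[t] = (1 - alpha) * tgt_emb[t] + alpha * mid
    return src_new, tgt_new
```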
Scalable Cross-Lingual Transfer of Neural Sentence Embeddings
TLDR
The results support representation transfer as a scalable approach for modular cross-lingual alignment of neural sentence embeddings, with better performance observed compared to joint models in intrinsic and extrinsic evaluations, particularly with smaller sets of parallel data.
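One common way to realize such a modular transfer is to fit a linear map from source-language sentence embeddings to target-language ones on a small parallel set. The least-squares sketch below is a hedged illustration of this general idea, not necessarily the exact procedure of the paper.

```python
import numpy as np

def fit_transfer_matrix(src_sents, tgt_sents):
    """Least-squares map W so that src_sents @ W approximates tgt_sents.

    src_sents, tgt_sents : (n, d_src) and (n, d_tgt) arrays holding the
                           sentence embeddings of n parallel sentence pairs.
    """
    W, *_ = np.linalg.lstsq(src_sents, tgt_sents, rcond=None)
    return W  # shape (d_src, d_tgt)

# After fitting, any new source-language sentence embedding `emb` can be
# projected into the target space with `emb @ W`.
```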
The Limitations of Cross-language Word Embeddings Evaluation
TLDR
It is concluded that the use of human references as ground truth for cross-language word embeddings is not appropriate unless one understands how native speakers process semantics in their cognition.
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions
TLDR
It is empirically demonstrated that the performance of CLE (cross-lingual embedding) models largely depends on the task at hand and that optimizing CLE models for BLI (bilingual lexicon induction) may hurt downstream performance; the most robust supervised and unsupervised CLE models are also identified.
Concatenated p-mean Word Embeddings as Universal Cross-Lingual Sentence Representations
TLDR
It is shown that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually.
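The power-mean construction is simple enough to sketch directly. Given a sentence as a matrix of word vectors, the p-mean for a power p is (1/n · sum_i x_i^p)^(1/p), with p = 1 the ordinary average and p = +inf / -inf the element-wise max / min; the sentence representation concatenates several such means (and, in the paper, does so over several embedding types). The sketch below covers only p = 1 and the infinities; fractional or even powers need extra care with negative components.

```python
import numpy as np

def p_mean(word_vectors, p):
    """Power mean over word vectors (rows). Supports p = 1, +inf, -inf here."""
    if p == float("inf"):
        return word_vectors.max(axis=0)
    if p == float("-inf"):
        return word_vectors.min(axis=0)
    return np.mean(word_vectors ** p, axis=0) ** (1.0 / p)

def sentence_embedding(word_vectors, powers=(1.0, float("inf"), float("-inf"))):
    """Concatenate several p-means into one sentence representation."""
    return np.concatenate([p_mean(word_vectors, p) for p in powers])
```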
Learning Cross-Lingual Word Embeddings from Twitter via Distant Supervision
TLDR
This paper exploits noisy user-generated text to learn cross-lingual embeddings tailored towards social media applications, and finds that such text also provides key opportunities due to the abundance of code-switching and the existence of a shared vocabulary of emoji and named entities.

References

Showing 1-10 of 48 references
Trans-gram, Fast Cross-lingual Word-embeddings
TLDR
Trans-gram is introduced, a simple and computationally efficient method to simultaneously learn and align word embeddings for a variety of languages, using only monolingual data and a smaller set of sentence-aligned data; it is also shown that some linguistic features are aligned across languages for which the authors have no aligned data.
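A hedged sketch of the kind of training signal Trans-gram adds: besides the usual monolingual skip-gram pairs, every word of a sentence also predicts the words of the aligned sentence in the other language, without requiring word-level alignment. The code below only generates those (center, context) pairs, which would then be fed to any skip-gram-style trainer; the names and the windowing choice are illustrative, not the authors' code.

```python
def monolingual_pairs(sentence, window=5):
    """Standard skip-gram pairs within one sentence (list of tokens)."""
    pairs = []
    for i, center in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        pairs.extend((center, sentence[j]) for j in range(lo, hi) if j != i)
    return pairs

def crosslingual_pairs(src_sentence, tgt_sentence):
    """Trans-gram-style pairs: each source word predicts every word of the
    aligned target sentence, so no word alignment is needed."""
    return [(c, t) for c in src_sentence for t in tgt_sentence]

# Each aligned sentence pair contributes both kinds of pairs, and the
# cross-lingual pairs are also generated in the other direction.
```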
Massively Multilingual Word Embeddings
TLDR
New methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space are introduced and a new evaluation method is shown to correlate better than previous ones with two downstream tasks.
Deep Multilingual Correlation for Improved Word Embeddings
TLDR
Deep non-linear transformations of word embeddings of the two languages are learned, using the recently proposed deep canonical correlation analysis, to improve their quality and consistency on multiple word and bigram similarity tasks.
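As a rough illustration of the correlation-based alignment, the sketch below uses plain linear CCA from scikit-learn as a simplified stand-in for the deep, non-linear version described above: embeddings of translation pairs from both languages are projected into a maximally correlated shared space.

```python
from sklearn.cross_decomposition import CCA

def cca_align(src_vectors, tgt_vectors, n_components=50):
    """Fit CCA on embeddings of translation pairs and return their projections.

    src_vectors, tgt_vectors : (n_pairs, dim) arrays; row i of each holds the
                               embedding of one side of translation pair i.
    n_components             : size of the shared space (must not exceed dim).
    """
    cca = CCA(n_components=n_components, max_iter=1000)
    src_proj, tgt_proj = cca.fit_transform(src_vectors, tgt_vectors)
    return cca, src_proj, tgt_proj

# cca.transform(new_src) maps unseen source-language vectors into the shared
# space; passing both arguments maps both sides at once.
```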
Inverted indexing for cross-lingual NLP
TLDR
A novel, count-based approach to obtaining inter-lingual word representations, based on inverted indexing of Wikipedia, is presented; it enables multi-source cross-lingual learning and improves over using state-of-the-art bilingual embeddings.
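The count-based idea lends itself to a small sketch: every word is represented by counts over the Wikipedia articles it occurs in, with articles about the same concept matched across languages via interlanguage links, so words from different languages live in the same concept-indexed space. The document collection and the linking are assumed to be given; all names below are illustrative.

```python
from collections import defaultdict

def inverted_index_vectors(docs_by_concept):
    """Build sparse word vectors indexed by language-independent concept IDs.

    docs_by_concept : dict mapping concept_id -> list of tokens, where the
                      tokens of the article about that concept may come from
                      any language (articles matched via interlanguage links).
    Returns a dict word -> {concept_id: count}.
    """
    vectors = defaultdict(lambda: defaultdict(int))
    for concept_id, tokens in docs_by_concept.items():
        for token in tokens:
            vectors[token][concept_id] += 1
    return {w: dict(v) for w, v in vectors.items()}
```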
Simple task-specific bilingual word embeddings
TLDR
A simple wrapper method is presented that uses off-the-shelf word embedding algorithms to learn task-specific bilingual word embeddings; it is independent of the choice of embedding algorithm, does not require parallel data, and can be adapted to specific tasks by re-defining the equivalence classes.
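A hedged sketch of the wrapper idea: every word that belongs to a task-specific equivalence class (for example a translation-dictionary entry) is replaced by its class ID in a mixed-language corpus, and any off-the-shelf embedding algorithm is then run on the result. gensim's Word2Vec (gensim >= 4) is used purely for illustration; the dictionary and parameters are assumptions.

```python
from gensim.models import Word2Vec

def train_class_embeddings(sentences, word_to_class, dim=100):
    """Map words to equivalence-class tokens, then train standard word2vec.

    sentences     : list of token lists, mixing both languages.
    word_to_class : dict mapping a word to its equivalence-class ID
                    (words without a class keep their surface form).
    """
    mapped = [[word_to_class.get(w, w) for w in sent] for sent in sentences]
    model = Word2Vec(mapped, vector_size=dim, window=5, min_count=1, sg=1)
    return model  # model.wv[class_or_word] gives the shared vector
```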
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
TLDR
It is shown that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%.
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
TLDR
It is shown that bilingual embeddings learned using the proposed BilBOWA model outperform state-of-the-art methods on a cross-lingual document classification task as well as a lexical translation task on WMT11 data.
Bilingual Word Representations with Monolingual Quality in Mind
TLDR
This work proposes a joint model that learns word representations from scratch, utilizing both context co-occurrence information through the monolingual component and meaning-equivalent signals from the bilingual constraint to learn high-quality bilingual representations efficiently.
Polyglot: Distributed Word Representations for Multilingual NLP
TLDR
This work quantitatively demonstrates the utility of word embeddings by using them as the sole features for training a part-of-speech tagger for a subset of the languages covered, and investigates the semantic features captured through the proximity of word groupings.
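A minimal illustration of using word embeddings as the only features of a part-of-speech tagger, as described above: a simple per-token classifier over the concatenated vectors of a word and its immediate neighbours. This is a sketch under those assumptions, not the authors' actual tagger or feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def token_features(sentence, emb, dim):
    """Concatenate embeddings of previous, current and next word per token."""
    pad = np.zeros(dim)
    vecs = [emb.get(w, pad) for w in sentence]
    feats = []
    for i in range(len(sentence)):
        prev_v = vecs[i - 1] if i > 0 else pad
        next_v = vecs[i + 1] if i < len(sentence) - 1 else pad
        feats.append(np.concatenate([prev_v, vecs[i], next_v]))
    return feats

def train_tagger(tagged_sentences, emb, dim):
    """tagged_sentences: list of (tokens, tags) pairs; emb: word -> vector."""
    X, y = [], []
    for tokens, tags in tagged_sentences:
        X.extend(token_features(tokens, emb, dim))
        y.extend(tags)
    return LogisticRegression(max_iter=1000).fit(np.array(X), y)
```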
Cross-lingual Dependency Parsing Based on Distributed Representations
TLDR
This paper provides two algorithms for inducing cross-lingual distributed representations of words, which map the vocabularies of two different languages into a common vector space, and bridges the lexical feature gap by using distributed feature representations and their composition.