Sparse Bilingual Word Representations for Cross-lingual Lexical Entailment

Abstract

We introduce the task of cross-lingual lexical entailment, which aims to detect whether the meaning of a word in one language can be inferred from the meaning of a word in another language. We construct a gold standard for this task and propose an unsupervised solution based on distributional word representations. As is commonly done in the monolingual setting, we assume that a word e entails a word f if the prominent context features of e are a subset of those of f. To address the challenge of comparing contexts across languages, we propose a novel method for inducing sparse bilingual word representations from monolingual and parallel texts. Our approach yields an F-score of 70% and significantly outperforms strong baselines based on translation and on existing word representations.
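
The feature-inclusion assumption stated in the abstract can be illustrated with a minimal sketch. It assumes each word is represented as a sparse, non-negative vector stored as a dictionary from feature index to weight, in a feature space shared by both languages. The function name `inclusion_score`, the example words, and the weights are hypothetical, and the measure is a simple weighted-inclusion ratio chosen for illustration; it is not necessarily the scoring function used in the paper.

```python
def inclusion_score(e_features, f_features):
    """Share of e's total feature weight that lies on features also active for f.

    Both arguments are sparse vectors: dicts mapping feature index -> non-negative weight.
    A score near 1.0 means the features of e are (almost) a subset of those of f.
    """
    total = sum(e_features.values())
    if total == 0:
        return 0.0  # an empty vector gives no evidence of entailment
    shared = sum(w for feat, w in e_features.items() if f_features.get(feat, 0.0) > 0)
    return shared / total


# Hypothetical sparse bilingual vectors (illustrative weights, not real data).
chien = {0: 0.5, 3: 0.25}            # French "chien" (dog)
animal = {0: 0.25, 3: 0.5, 7: 0.25}  # English "animal"

print(inclusion_score(chien, animal))  # 1.0: every feature of "chien" is active for "animal"
print(inclusion_score(animal, chien))  # 0.75: feature 7 of "animal" is missing from "chien"
```

The score is asymmetric by design: the narrower word's features are fully covered by the broader word's, but not the other way round, which is what makes the measure usable for a directional relation such as entailment.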

Cite this paper

@inproceedings{Vyas2016SparseBW,
  title={Sparse Bilingual Word Representations for Cross-lingual Lexical Entailment},
  author={Yogarshi Vyas and Marine Carpuat},
  booktitle={HLT-NAACL},
  year={2016}
}