LAMB: A Good Shepherd of Morphologically Rich Languages

@inproceedings{Ebert2016LAMBAG,
  title={LAMB: A Good Shepherd of Morphologically Rich Languages},
  author={Sebastian Ebert and Thomas M{\"u}ller and Hinrich Sch{\"u}tze},
  booktitle={EMNLP},
  year={2016}
}
This paper introduces STEM and LAMB, embeddings trained for stems and lemmata instead of for surface forms. For morphologically rich languages, they perform significantly better than standard embeddings on word similarity and polarity evaluations. On a new WordNet-based evaluation, STEM and LAMB are up to 50% better than standard embeddings. We show that both embeddings have high quality even for small dimensionality and training corpora.
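
As a rough illustration of why normalizing surface forms can help for morphologically rich languages, the following toy sketch merges inflected forms under one lemma before counting co-occurrences, so that evidence spread over several surface forms is pooled into a single entry. It is not the authors' pipeline: the lemma table `TOY_LEMMAS`, the helper names, the window size, and the example corpus are all hypothetical, and a real system would use a morphological analyzer and an embedding trainer rather than raw counts.

```python
from collections import Counter

# Hypothetical toy lemma table (assumption, not from the paper);
# a real pipeline would obtain lemmata from a morphological analyzer.
TOY_LEMMAS = {
    "runs": "run",
    "ran": "run",
    "running": "run",
}


def normalize(tokens, lemma_table):
    """Map each token to its lemma, falling back to the lowercased token."""
    return [lemma_table.get(t.lower(), t.lower()) for t in tokens]


def cooccurrence_counts(tokens, window=2):
    """Count (word, context) pairs within a symmetric window."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1
    return counts


corpus = "She runs home . He ran home . They were running home".split()

# Surface forms: each inflection keeps its own sparse statistics.
surface = cooccurrence_counts([t.lower() for t in corpus])

# Lemmata: the three inflections share one entry.
lemmatized = cooccurrence_counts(normalize(corpus, TOY_LEMMAS))
```

In this toy corpus each of "runs", "ran", and "running" co-occurs with "home" once, while the shared lemma "run" co-occurs with it three times. Pooling of this kind is one plausible reason lemma-level embeddings could remain usable with small training corpora.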