Improving Statistical Machine Translation with Word Class Models

Joern Wuebker, Stephan Peitz, Felix Rietig, Hermann Ney
Automatically clustering words from a monolingual or bilingual training corpus into classes is a widely used technique in statistical natural language processing. We present a very simple and easy to implement method for using these word classes to improve translation quality. It can be applied across different machine translation paradigms and with arbitrary types of models. We show its efficacy on a small German→English and a larger French→German translation task with both standard phrase…
This paper has 47 citations.

Key Quantitative Results

  • Our results show that with word class models, the baseline can be improved by up to 1.4% BLEU and 1.0% TER on the French→German task and 0.3% BLEU and 1.1% TER on the German→English task.
