Improving statistical machine translation by classifying and generalizing inflected verb forms

Abstract

This paper incorporates a rule-based classification of single-word and compound verbs into a statistical machine translation approach. By substituting verb forms with the lemma of their head verb, the data sparseness problem caused by highly inflected languages can be successfully addressed. On the other hand, the information of seen verb forms can be used to…
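
The substitution idea can be illustrated with a minimal sketch. It is not the paper's rule set: the lexicon `LEMMAS`, the auxiliary list `AUXILIARIES`, the `V[lemma]` token format, and the function name `generalize_verbs` are all hypothetical and chosen only for illustration. The sketch collapses a verb group (optional auxiliaries followed by a known head verb) into a single lemma token, so that different inflected or compound forms of the same verb share one vocabulary entry.

```python
# Hypothetical toy lexicon (inflected form -> lemma); not taken from the paper.
LEMMAS = {
    "go": "go", "goes": "go", "went": "go", "gone": "go", "going": "go",
    "see": "see", "sees": "see", "saw": "see", "seen": "see", "seeing": "see",
}

# Hypothetical auxiliary list used to detect compound verb forms.
AUXILIARIES = {"has", "have", "had", "is", "are", "was", "were",
               "will", "would", "be", "been"}


def generalize_verbs(tokens):
    """Replace each verb group (auxiliaries + known head verb) by V[lemma]."""
    out = []
    pending = []  # auxiliaries that may turn out to belong to a compound verb
    for tok in tokens:
        low = tok.lower()
        if low in AUXILIARIES:
            pending.append(tok)
        elif low in LEMMAS:
            # Head verb found: drop the collected auxiliaries, emit one token.
            out.append(f"V[{LEMMAS[low]}]")
            pending = []
        else:
            # No head verb followed: keep the auxiliaries unchanged.
            out.extend(pending)
            pending = []
            out.append(tok)
    out.extend(pending)
    return out


print(generalize_verbs("she has been going home".split()))
print(generalize_verbs("he went home".split()))
```

In this toy setting, "has been going" and "went" both map to the same token `V[go]`, which is the kind of generalization that reduces sparseness in the training data. A real system would need a proper morphological analyzer and language-specific rules rather than a hand-written dictionary.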

4 Figures and Tables