Vector Space Model for Adaptation in Statistical Machine Translation
@inproceedings{Chen2013VectorSM,
title={Vector Space Model for Adaptation in Statistical Machine Translation},
author={Boxing Chen and Roland Kuhn and George F. Foster},
booktitle={ACL},
year={2013}
}
This paper proposes a new approach to domain adaptation in statistical machine translation (SMT) based on a vector space model (VSM). The general idea is first to create a vector profile for the in-domain development (“dev”) set. This profile might, for instance, be a vector with a dimensionality equal to the number of training subcorpora; each entry in the vector reflects the contribution of a particular subcorpus to all the phrase pairs that can be extracted from the dev set. Then, for each…
Experiments on large-scale NIST evaluation data show improvements over strong baselines: +1.8 BLEU on Arabic-to-English and +1.4 BLEU on Chinese-to-English over a non-adapted baseline, and significant improvements in most circumstances over baselines that use linear mixture model adaptation.
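The dev-set profile described in the abstract can be illustrated with a minimal sketch. It assumes that the "contribution" of a subcorpus is the raw count of dev-set phrase pairs observed in that subcorpus, normalized so the entries sum to one; the abstract does not specify the weighting, so this choice, the function name `dev_profile`, and the toy data are all hypothetical.

```python
from collections import Counter


def dev_profile(dev_phrase_pairs, subcorpus_counts):
    """Build a vector with one entry per training subcorpus.

    dev_phrase_pairs: phrase pairs (source, target) extracted from the dev set.
    subcorpus_counts: one Counter per subcorpus, mapping phrase pair -> count.

    Assumption: each entry is the subcorpus's share of all dev phrase-pair
    occurrences. The paper's actual weighting may differ.
    """
    pairs = list(dev_phrase_pairs)
    # A Counter returns 0 for phrase pairs it has never seen.
    totals = [sum(counts[pair] for pair in pairs) for counts in subcorpus_counts]
    norm = sum(totals)
    if norm == 0:
        # No dev phrase pair occurs in any subcorpus: return an all-zero profile.
        return [0.0] * len(subcorpus_counts)
    return [total / norm for total in totals]


# Toy data: two hypothetical subcorpora and two dev-set phrase pairs.
news = Counter({("maison", "house"): 3, ("chat", "cat"): 1})
web = Counter({("maison", "house"): 1})
dev_pairs = [("maison", "house"), ("chat", "cat")]

profile = dev_profile(dev_pairs, [news, web])
print(profile)  # [0.8, 0.2]
```

The vector has as many dimensions as there are subcorpora, matching the example in the abstract. Here the first subcorpus accounts for four of the five dev phrase-pair occurrences.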