Vector Space Model for Adaptation in Statistical Machine Translation

Authors: Boxing Chen, Roland Kuhn, George F. Foster
This paper proposes a new approach to domain adaptation in statistical machine translation (SMT) based on a vector space model (VSM). The general idea is first to create a vector profile for the in-domain development (“dev”) set. This profile might, for instance, be a vector with a dimensionality equal to the number of training subcorpora; each entry in the vector reflects the contribution of a particular subcorpus to all the phrase pairs that can be extracted from the dev set. Then, for each…
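The dev-set profile described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not in the text above: "contribution" is taken to be the raw count of each dev-set phrase pair in a subcorpus, summed over all dev-set phrase pairs, and the function name `dev_profile`, the subcorpus names, and the toy counts are all hypothetical. The paper's actual weighting may differ, and the step that follows the profile is truncated in the abstract, so it is not sketched here.

```python
def dev_profile(dev_phrase_pairs, subcorpus_counts, subcorpora):
    """Build a profile vector with one entry per training subcorpus.

    Assumption (not from the abstract): an entry is the summed raw count,
    within that subcorpus, of every phrase pair extracted from the dev set.
    """
    profile = []
    for name in subcorpora:
        counts = subcorpus_counts[name]
        # Phrase pairs absent from a subcorpus contribute zero.
        profile.append(sum(counts.get(pair, 0) for pair in dev_phrase_pairs))
    return profile


# Hypothetical toy data: phrase-pair counts in three training subcorpora.
subcorpus_counts = {
    "news": {("src_a", "tgt_a"): 3, ("src_b", "tgt_b"): 1},
    "web": {("src_a", "tgt_a"): 1},
    "un": {},
}
dev_pairs = [("src_a", "tgt_a"), ("src_b", "tgt_b")]

profile = dev_profile(dev_pairs, subcorpus_counts, ["news", "web", "un"])
print(profile)  # [4, 1, 0]
```

The vector has as many dimensions as there are subcorpora, matching the example given in the abstract; a subcorpus that contains none of the dev-set phrase pairs gets an entry of zero.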
Highly Cited
This paper has 89 citations.

From This Paper

Key Quantitative Results

  • Experiments on large-scale NIST evaluation data show improvements over strong baselines: +1.8 BLEU on Arabic-to-English and +1.4 BLEU on Chinese-to-English over a non-adapted baseline, and significant improvements in most circumstances over baselines with linear mixture model adaptation.


Publications citing this paper.

A Probabilistic Feature-Based Fill-up for SMT

Method Support
Highly Influenced

LIMSI @ WMT'14 Medical Translation Task

WMT@ACL • 2014
Method Support
Highly Influenced

WMT'14 Medical Translation Task

Method Support
Highly Influenced

A survey of domain adaptation for statistical machine translation

Machine Translation • 2017
Method Support
Highly Influenced

Adapting to All Domains at Once: Rewarding Domain Invariance in SMT

Transactions of the Association for Computational Linguistics • 2016
Highly Influenced

Sentence Selection and Weighting for Neural Machine Translation Domain Adaptation

IEEE/ACM Transactions on Audio, Speech, and Language Processing • 2018

Semantic Scholar estimates that this publication has 90 citations based on the available data.