Document-level Re-ranking with Soft Lexical and Semantic Features for Statistical Machine Translation

@inproceedings{Ding2014DocumentlevelRW,
  title={Document-level Re-ranking with Soft Lexical and Semantic Features for Statistical Machine Translation},
  author={Chenchen Ding},
  year={2014}
}
We introduce two document-level features to polish baseline sentence-level translations generated by a state-of-the-art statistical machine translation (SMT) system. One feature uses word embeddings to model the relation between a sentence and its context on the target side; the other is a crisp document-level token-type ratio of target-side translations for source-side words, which models lexical consistency in translation. The weights of the introduced features are tuned to…
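
To make the two features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formulation, which this excerpt does not give. The function names (`token_type_ratio`, `mean_vector`, `cosine`), the input format (word-aligned source/target pairs collected over one document), the choice of tokens divided by types, and the use of averaged word vectors with cosine similarity are all hypothetical.

```python
import math
from collections import defaultdict


def token_type_ratio(aligned_pairs):
    """Per-source-word ratio of translation tokens to distinct translation types.

    aligned_pairs: iterable of (source_word, target_word) tuples gathered
    over a whole document (assumed input format, not from the paper).
    A higher ratio means the source word was translated more consistently.
    """
    translations = defaultdict(list)
    for src, tgt in aligned_pairs:
        translations[src].append(tgt)
    return {
        src: len(tgts) / len(set(tgts))
        for src, tgts in translations.items()
    }


def mean_vector(words, embeddings):
    """Average the embeddings of the known words; empty list if none are known."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return []
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy document: "bank" is translated three times with two distinct target words.
pairs = [("bank", "banque"), ("bank", "banque"), ("bank", "rive"),
         ("river", "fleuve")]
ratios = token_type_ratio(pairs)

# Toy two-dimensional embeddings (made-up values for illustration only).
emb = {"banque": [1.0, 0.0], "argent": [1.0, 0.0], "fleuve": [0.0, 1.0]}
sentence_vec = mean_vector(["banque"], emb)
context_vec = mean_vector(["argent", "fleuve"], emb)
similarity = cosine(sentence_vec, context_vec)
```

In a re-ranking setup, scores like these would be computed for each candidate translation and combined with the baseline model score using tuned weights; how the paper does that tuning is cut off in the excerpt above.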