We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems, which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar…
A Statistical Machine Translation (SMT) system generates an n-best list of candidate translations for each sentence. A model error occurs if the most probable translation (1-best) generated by the SMT decoder is not the most accurate as measured by its similarity to the human reference translation(s) (an oracle). In this paper we investigate the parametric…
Nearly all resident proteins of the organelles along the secretory pathway, as well as proteins that are expressed at the cell surface or secreted from the cell, are first co-translationally translocated into the lumen of the endoplasmic reticulum (ER) as unfolded polypeptide chains. Immediately after entering the ER, they are often modified with N-linked…
In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. However, there is a need for a systematic study of which alignment characteristics benefit MT under specific experimental settings, such as the type of MT system, the language pair, or the type or size of the corpus. In this paper we perform, in…
We introduce a bilingually motivated word segmentation approach for languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Our approach is motivated by the insight that PB-SMT systems can be improved by optimizing the input representation to reduce the predictive power of…
We improve translation memory (TM)-inspired consistent phrase-based statistical machine translation (PB-SMT) using rich linguistic information, including lexical, part-of-speech, dependency, and semantic role features, to predict whether a TM-derived sub-segment should constrain PB-SMT translation. Besides better translation consistency, for…