This paper proposes a new approach to domain adaptation in statistical machine translation (SMT) based on a vector space model (VSM). The general idea is first to create a vector profile for the in-domain development ("dev") set. This profile might, for instance, be a vector with a dimensionality equal to the number of training subcorpora; each entry in ...
A recent paper described a new machine translation evaluation metric, AMBER. This paper describes two changes to AMBER. The first is the incorporation of a new ordering penalty; the second is the use of the downhill simplex algorithm to tune the weights for the components of AMBER. We tested the impact of the two changes, using data from the WMT metrics ...
Previous research on domain adaptation (DA) for statistical machine translation (SMT) has mainly focused on the translation model (TM) and the language model (LM). To the best of our knowledge, there is no previous work on reordering model (RM) adaptation for phrase-based SMT. In this paper, we demonstrate that mixture model adaptation of a lexicalized RM ...
This paper describes the statistical machine translation system developed at ITC-irst for the evaluation campaign of the International Workshop on Spoken Language Translation 2005. The system exploits two search passes: the first pass is performed by a beam-search decoder which generates an n-best list of translations, the second by a simple re-scoring ...
This paper proposes the use of rules automatically extracted from word-aligned training data to model word reordering phenomena in phrase-based statistical machine translation. Scores computed from matching rules are used as additional feature functions in the rescoring stage of the automatic translation process from various languages to English, in the ...
This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT'08 evaluation campaign. We present N-gram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the 2008 systems' ...
This paper describes how to cluster together the phrases of a phrase-based statistical machine translation (SMT) system, using information in the phrase table itself. The clustering is symmetric and recursive: it is applied both to source-language and target-language phrases, and the clustering in one language helps determine the clustering in the other. ...