The Alignment Template Approach to Statistical Machine Translation

  • Franz Josef Och, Hermann Ney
  • Published 1 December 2004
  • Computer Science
  • Computational Linguistics
A phrase-based statistical machine translation approach, the alignment template approach, is described. This approach allows for general many-to-many relations between words. As a result, the context of words is taken into account in the translation model, and local changes in word order from source to target language can be learned explicitly. The model is described using a log-linear modeling approach, which is a generalization of the often-used source-channel approach. Thereby, the…
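
The relation between the log-linear model and the source-channel approach mentioned in the abstract can be written out as below. This is a sketch in the notation commonly used in the log-linear SMT literature, not a quotation from the paper: the symbols f (source sentence), e (target sentence), h_m (feature functions), λ_m (feature weights) and M (number of features) are assumptions introduced here for illustration.

```latex
% Log-linear model: M feature functions h_m with weights \lambda_m
\begin{align}
\Pr(e \mid f) &= \frac{\exp\Big[\sum_{m=1}^{M} \lambda_m h_m(e, f)\Big]}
                      {\sum_{e'} \exp\Big[\sum_{m=1}^{M} \lambda_m h_m(e', f)\Big]} \\
% Decision rule: the denominator does not depend on e, so it drops out
\hat{e} &= \operatorname*{argmax}_{e} \sum_{m=1}^{M} \lambda_m h_m(e, f) \\
% Source-channel special case: M = 2, \lambda_1 = \lambda_2 = 1,
% h_1(e, f) = \log \Pr(e), \quad h_2(e, f) = \log \Pr(f \mid e)
\hat{e} &= \operatorname*{argmax}_{e} \big[ \log \Pr(e) + \log \Pr(f \mid e) \big]
         = \operatorname*{argmax}_{e} \Pr(e) \, \Pr(f \mid e)
\end{align}
```

Under these assumptions, the source-channel approach is the special case with two features (a language model and a translation model) and unit weights, which is the sense in which the log-linear model generalizes it.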

The application of source language information in Chinese-English statistical machine translation

This work argues that unaligned words in the source language cause translation errors such as word deletions and word insertions, and has developed a semi-automatic approach to derive the Chinese category rules from the English categories via word alignment.

An incremental syntactic language model for statistical phrase-based machine translation

A novel technique for incorporating syntactic knowledge as a language model in the context of statistical phrase-based machine translation (Koehn et al., 2003) is presented, and empirical results suggest that the incremental syntactic language model does guide the translation algorithm towards more fluent target-language translations.

Statistical Approaches to Computer-Assisted Translation

Alignment templates, phrase-based models, and stochastic finite-state transducers are used to develop computer-assisted translation systems in a European project in two real tasks.

Towards integrated machine translation using structural alignment from syntax-augmented synchronous parsing

This paper presents a novel method to connect a re-alignment model with a translation model in an integrated framework based on syntax-augmented synchronous context-free grammar and generates a new word alignment under the structural constraint from the Viterbi tree.

Using Alignment Templates to Infer Shallow-Transfer Machine Translation Rules

This paper describes how to adapt the alignment templates used in statistical machine translation to the rule-based machine translation framework, and shows there is a considerable improvement in the translation quality as compared to word-for-word translation.

Phrase Alignment Models for Statistical Machine Translation

This thesis describes a discriminative model of multi-scale, overlapping phrases that outperforms all previously proposed models of phrase alignment and establishes model-based phrase alignment as the most effective approach to acquiring phrasal translations.

New Statistical And Syntactic Models For Machine Translation

A novel syntax-based approach is proposed to handle the fundamental problem of word ordering for SMT, exploiting syntactic representations of source and target texts and automatically extracted hierarchical reordering patterns.

Supervised Word Alignment for Phrase-based Statistical Machine Translation

A discriminative model, conditional random fields (CRF), is applied to solve the word alignment problem, and it is shown that a good alignment result is useful in improving the translation quality in phrase-based SMT.

Hierarchical Phrase-based Translation Model vs. Classical Phrase-based Translation Model for Spanish-English Statistical Machine Translation System

It is shown that a hierarchical phrase-based translation system will outperform a classical phrase-based system in the Spanish-English translation direction, but for the English-Spanish direction, the classical phrase-based system is preferable.

The ISI/USC MT system

The ISI/USC machine translation system is a statistical system based on a phrase translation model that is trained on bilingual parallel data, and it uses a log-linear model to combine several different knowledge sources into a direct model of translation.



Grammar Inference and Statistical Machine Translation

A structure-based alignment model is introduced that can directly depict the structure between English and German spoken languages and results in focused learning of word alignment, which can alleviate the sparse data problem.

A DP-based Search Using Monotone Alignments in Statistical Translation

A dynamic programming (DP) based search algorithm for statistical translation uses a first-order Hidden Markov model of the kind used successfully in speech recognition for the time alignment problem.

A Projection Extension Algorithm for Statistical Machine Translation

A phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models and that has been successfully tested on a Chinese-English and an Arabic-English translation task.

Decoding Complexity in Word-Replacement Translation Models

This work shows that for the simplest form of statistical models, this problem is NP-complete, i.e., probably exponential in the length of the observed sentence, and traces this complexity to factors not present in other decoding problems.

Machine Translation with a Stochastic Grammatical Channel

A stochastic grammatical channel model for machine translation that synthesizes several desirable characteristics of both statistical and grammatical machine translation and achieves significant speed gains over the earlier model.

Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation

A novel technique to restrict the possible word reorderings between source and target language in order to achieve an efficient search algorithm for statistical machine translation based on dynamic programming (DP).

HMM-Based Word Alignment in Statistical Translation

A new model for word alignment in statistical translation that uses a first-order Hidden Markov model of the kind used successfully in speech recognition for the time alignment problem.

Word re-ordering and dynamic programming based search algorithm for statistical machine translation

A new search procedure for statistical machine translation (SMT) based on dynamic programming (DP) is proposed, and a generalization of the re-ordering restrictions is given that is applicable to several different translation directions.

Refined Lexicon Models for Statistical Machine Translation Using a Maximum Entropy Approach

It is shown that it is possible to significantly decrease training and test corpus perplexity of the translation models and perform a rescoring of N-Best lists using the maximum entropy model and thereby yield an improvement in translation quality.

The Mathematics of Statistical Machine Translation: Parameter Estimation

It is reasonable to argue that word-by-word alignments are inherent in any sufficiently large bilingual corpus, given a set of pairs of sentences that are translations of one another.