• Corpus ID: 11469421

Producing Translation Tables by Separate N-grams Subtables

Juan Luo, Jing Sun, Y. Lepage
By investigating the distribution of phrase pairs in translation tables, this paper describes an approach to increase the number of n-gram alignments in translation tables output by the sampling-based alignment method. Translation subtables are produced to increase the number of n-grams. A standard normal time distribution is used to adapt the distribution of n-grams in translation tables, which leads to better evaluation results than the original approach. Merging translation table of the sampling…

Using Collocation Segmentation to Augment the Phrase Table

The 2010 phrase-based statistical machine translation system developed at the TALP Research Center of the UPC in cooperation with BMIC and VMU is described, and the configurations considered and the results obtained with internal and official test sets are reported.

Bleu: a Method for Automatic Evaluation of Machine Translation

This work proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

Minimum Error Rate Training in Statistical Machine Translation

It is shown that significantly better results can often be obtained if the final evaluation criterion is taken directly into account as part of the training procedure.

HMM-Based Word Alignment in Statistical Translation

A new model for word alignment in statistical translation is described that uses a first-order Hidden Markov model for the word alignment problem, as such models are used successfully in speech recognition for the time alignment problem.

A Systematic Comparison of Various Statistical Alignment Models

An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models.

The Mathematics of Statistical Machine Translation: Parameter Estimation

It is reasonable to argue that word-by-word alignments are inherent in any sufficiently large bilingual corpus, given a set of pairs of sentences that are translations of one another.

Moses: Open Source Toolkit for Statistical Machine Translation

We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c)

SRILM - an extensible language modeling toolkit

The functionality of the SRILM toolkit is summarized and its design and implementation are discussed, highlighting ease of rapid prototyping, reusability, and combinability of tools.

Parallel Implementations of Word Alignment Tool

Two parallel implementations of GIZA++ are presented that accelerate the word alignment process, showing a near-linear speed-up with the number of CPUs used while preserving alignment quality.

Sampling-based multilingual alignment

  • International Conference on Recent Advances in Natural Language Processing (RANLP 2009)
  • 2009