• Corpus ID: 813729

Syntactic Features for Evaluation of Machine Translation

Ding Liu, Daniel Gildea
Automatic evaluation of machine translation, based on computing n-gram similarity between system output and human reference translations, has revolutionized the development of MT systems. We explore the use of syntactic information, including constituent labels and head-modifier dependencies, in computing similarity between output and reference. Our results show that adding syntactic information to the evaluation metric improves both sentence-level and corpus-level correlation with human…
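The n-gram similarity mentioned in the abstract can be illustrated with a minimal sketch of BLEU-style clipped n-gram precision. This shows only the n-gram baseline, not the syntactic metrics the paper proposes. The function names `ngrams` and `clipped_precision` and the example sentences are hypothetical, chosen for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams that also occur in the reference.

    Each n-gram's match count is clipped to its count in the reference,
    so repeating a correct word cannot inflate the score.
    """
    hyp = ngrams(hypothesis, n)
    ref = ngrams(reference, n)
    total = sum(hyp.values())
    if total == 0:
        # No n-grams of this order in the hypothesis (assumed convention: score 0).
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


# Hypothetical system output and human reference, whitespace-tokenized.
hypothesis = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()

unigram = clipped_precision(hypothesis, reference, 1)  # 5 of 6 unigrams match
bigram = clipped_precision(hypothesis, reference, 2)   # 3 of 5 bigrams match
```

A syntactic variant in the spirit of the abstract would count matching units drawn from parse trees (for example head-modifier pairs) instead of flat word sequences. That step requires a parser and is not shown here.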

A New Syntactic Metric for Evaluation of Machine Translation
A new syntactic metric for MT evaluation based on the comparison of the dependency structures of the reference and the candidate translations is proposed by means of a Weighted Constraints Dependency Grammar parser.
Using F-structures in machine translation evaluation
It is shown how Lexical-Functional Grammar (LFG) labelled dependencies obtained from an automatic parse can be used to assess the quality of MT on a deeper linguistic level, giving as a result higher correlations with human judgements.
Labelled Dependencies in Machine Translation Evaluation
A dependency-based method for evaluating the quality of Machine Translation output, using labelled dependencies produced by a Lexical-Functional Grammar (LFG) parser, which reaches high correlation with human scores.
Evaluating machine translation with LFG dependencies
This paper shows how labelled dependencies produced by a Lexical-Functional Grammar parser can be used in Machine Translation evaluation, and obtains high correlation with human scores, both on a segment and system level.
Regression for Sentence-Level MT Evaluation with Pseudo References
These metrics are developed using regression learning and are based on a set of weaker indicators of fluency and adequacy, which suggest that they rival standard reference-based metrics in terms of correlations with human judgments on new test instances.
Using parallel treebanks for machine translation evaluation
It is shown that the evaluation accuracy can be increased by using word alignments extracted from a parallel treebank, and the performance of the statistical MT system is compared with two other competitive systems with respect to a set of problematic linguistic structures for translation between German and French.
Evaluation of machine translation with dependent Skip-Ngrams
  • HongPeng Yu, Hongwei Xu
  • Computer Science
    Proceedings of 2012 International Conference on Measurement, Information and Control
  • 2012
An automatic evaluation method for machine translation systems that extends the idea of skip-bigram with unequal length grams and dependency relations and qualifies the grams with linguistic knowledge.
Significance tests of automatic machine translation evaluation metrics
A significance test-driven comparison of n-gram-based automatic MT evaluation metrics is presented and the characteristics of different MT evaluation metrics and how to construct reliable and efficient evaluation suites are studied.
Using Discourse Structure Improves Machine Translation Evaluation
Two discourse-aware similarity measures are designed, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory and can help improve a number of existing machine translation evaluation metrics both at the segment- and at the system-level.
Source-Language Features and Maximum Correlation Training for Machine Translation Evaluation
We propose three new features for MT evaluation: source-sentence constrained n-gram precision, source-sentence reordering metrics, and discriminative unigram precision, as well as a method of


Confidence Estimation for Machine Translation
A detailed study of confidence estimation for machine translation, using data from the NIST 2003 Chinese-to-English MT evaluation to investigate various methods for determining whether MT output is correct.
Bleu: a Method for Automatic Evaluation of Machine Translation
This work proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
A Path-based Transfer Model for Machine Translation
A path-based transfer model for machine translation that translates a path in the source language dependency tree into a fragment in the target dependency tree and becomes a graph-theoretic problem of finding the minimum path covering of the source language dependency tree.
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
DARPA commissioned NIST to develop an MT evaluation facility based on the IBM work, which is now available from NIST and serves as the primary evaluation measure for TIDES MT research.
A Smorgasbord of Features for Statistical Machine Translation
We describe a methodology for rapid experimentation in statistical machine translation which we use to add a large number of features to a baseline system exploiting features from a wide range of
Phrasal Cohesion and Statistical Machine Translation
It is demonstrated that while there are cases where coherence is poor, there are many regularities which can be exploited by a statistical machine translation system.
Minimum Error Rate Training in Statistical Machine Translation
It is shown that significantly better results can often be obtained if the final evaluation criterion is taken directly into account as part of the training procedure.
A learning approach to improving sentence-level MT evaluation
A novel method involving classifying translations as machine or human-produced rather than directly predicting numerical human judgments eliminates the need for labor-intensive user studies as a source of training data and is shown to significantly improve upon current automatic metrics.
Convolution Kernels for Natural Language
It is shown how a kernel over trees can be applied to parsing using the voted perceptron algorithm, and experimental results on the ATIS corpus of parse trees are given.
Head-Driven Statistical Models for Natural Language Parsing
  • M. Collins
  • Computer Science
    Computational Linguistics
  • 2003
Three statistical models for natural language parsing are described, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree.