• Corpus ID: 6308240

Machine Translation Evaluation: A Survey

  Lifeng Han and Derek F. Wong
This paper introduces a state-of-the-art machine translation (MT) evaluation survey that covers both manual and automatic evaluation methods. Traditional human evaluation criteria mainly include intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness. Advanced human assessments include task-oriented measures, post-editing, segment ranking, and extended criteria. We classify the automatic evaluation methods into two categories, including lexical… 

Integrating Meaning into Quality Evaluation of Machine Translation

The results of two experiments confirm the benefit of meaning-related features in predicting human evaluation of translation quality, in addition to traditional metrics which focus mainly on form.

Fine-grained human evaluation of an English to Croatian hybrid machine translation system

This research compares two approaches to statistical machine translation - pure phrase-based and factored phrase-based - by performing a fine-grained manual evaluation via error annotation of the

Evaluation of Text Generation: A Survey

This paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models.

Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages

The efficiency of translation using bidirectional encoder-attention-decoder models was studied with respect to morphologically rich languages, and the use of morphological segmentation improved the efficacy of the attention mechanism.

Extending a model for ontology-based Arabic-English machine translation

  • N. A. Dahan, F. Ba-Alwi
  • Computer Science
    International Journal of Artificial Intelligence & Applications
  • 2019
This research extends a model for ontology-based Arabic-English machine translation, named NAN, which simulates the human approach to translation; results show that NAN's output is closer to human translation than that of other instant translators.

Machine-Translation History and Evolution: Survey for Arabic-English Translations

This research contributes to the machine-translation area by giving future researchers a summary of the machine-translation research groups and by shedding light on the importance of the translation mechanism.

Preference learning for machine translation

Algorithms are developed that can learn from very large amounts of data by exploiting pairwise preferences defined over competing translations; they not only make a machine translation system robust to arbitrary texts from varied sources, but also enable it to adapt effectively to new domains of data.
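The core idea of learning from pairwise preferences over competing translations can be sketched as a perceptron-style update on feature differences. This is a minimal illustration, not the cited paper's algorithm; the feature vectors and preference pairs below are invented for the example.

```python
# Hypothetical sketch: learn a linear scoring weight vector from pairwise
# preferences (better_translation, worse_translation), each represented by
# a small feature vector. Data here is invented for illustration only.

def train_preferences(pairs, dim, epochs=10, lr=0.1):
    """pairs: list of (better_features, worse_features) tuples."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            score_b = sum(wi * fi for wi, fi in zip(w, better))
            score_w = sum(wi * fi for wi, fi in zip(w, worse))
            if score_b <= score_w:  # preferred translation not ranked higher
                # Nudge weights toward the preferred translation's features.
                for i in range(dim):
                    w[i] += lr * (better[i] - worse[i])
    return w

# Each translation is a tiny feature vector, e.g. (LM score, length ratio).
pairs = [((0.9, 1.0), (0.4, 1.3)),
         ((0.8, 0.9), (0.5, 1.4))]
w = train_preferences(pairs, dim=2)
```

After training, scoring any pair of competing translations with `w` ranks the preferred one higher, which is the property such preference-learning approaches optimise for.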

Algorithmes bio-inspirés pour la traduction automatique statistique. (Bio-inspired Algorithms for Statistical Machine Translation)

Different components of statistical machine translation systems are treated as optimization problems; in particular, learning the translation model, decoding, and

Utjecaj višejezičnosti vrednovatelja na ljudsku procjenu kvalitete strojnih prijevoda (The Influence of Evaluators' Multilingualism on Human Evaluation of Machine Translation Quality)

This paper presents a study of the influence of evaluators' multilingualism on the subjective method of evaluating machine translation quality. The subjectivity of this method is most often reflected in low

Exploratory visual text analytics in the scientific literature domain




Unsupervised Quality Estimation Model for English to German Translation and Its Application in Extensive Supervised Evaluation

An unsupervised MT evaluation metric using a universal part-of-speech tagset, without relying on reference translations, is proposed; experiments show that the designed methods yield higher correlation scores with human judgments.

Phrase-Based Evaluation for Machine Translation

High-level abstract information such as semantic similarity and topic models are introduced into this phrase-based evaluation metric, which achieves comparable correlation with human judgements at the segment level and significantly higher correlation at the document level.

A task-oriented evaluation metric for machine translation

The methodology correlates the recorded subjective judgments of the raters in the DARPA MT Evaluation with users' performance on task-based exercises, and includes a sample inventory of the tasks for which translated material is used, describing exercises in which users perform each task with MT output.

Combining Confidence Estimation and Reference-based Metrics for Segment-level MT Evaluation

This work describes an effort to improve standard reference-based metrics for Machine Translation evaluation by enriching them with Confidence Estimation features and using a learning mechanism trained on human annotations to provide MT evaluation metrics that achieve higher correlation with human judgments at the segment level.
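Several of the metrics surveyed here are judged by how well their segment-level scores correlate with human judgments. As a minimal sketch of that evaluation methodology (the scores and ratings below are hypothetical, not taken from any cited paper):

```python
# Illustrative sketch of segment-level meta-evaluation: correlate an
# automatic metric's per-segment scores with human adequacy ratings
# using Pearson's r. All numbers below are invented for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

metric_scores   = [0.62, 0.41, 0.78, 0.55, 0.90]  # automatic metric, per segment
human_judgments = [3.0, 2.0, 4.0, 3.0, 5.0]       # adequacy ratings, per segment

r = pearson(metric_scores, human_judgments)
print(f"segment-level Pearson r = {r:.3f}")
```

A metric with higher r tracks human judgments more closely; in practice Spearman or Kendall rank correlation is often reported instead when only relative rankings of segments matter.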

LEPOR: An Augmented Machine Translation Evaluation Metric

Novel MT evaluation methods are designed in which the weighting of factors can be optimised according to the characteristics of languages; a concise linguistic feature based on POS tags is also designed, showing that the methods can yield even higher performance when external linguistic resources are used.

Predicting Machine Translation Adequacy

This paper proposes a number of indicators contrasting the source and translation texts to predict the adequacy of such translations at the sentence level, and shows that these indicators can yield improvements over previous work using general quality indicators based on source complexity and target fluency.

Fully Automatic Semantic MT Evaluation

This work introduces the first fully automatic, fully semantic frame based MT evaluation metric, MEANT, that outperforms all other commonly used automatic metrics in correlating with human judgment on translation adequacy and demonstrates that performing the semantic frame alignment automatically actually tends to be just as good as performing it manually.

Regression and Ranking based Optimisation for Sentence Level MT Evaluation

This paper introduces a new trained metric, ROSE, which only uses simple features that are easily portable and quick to compute, and whose performance still holds when ROSE is trained on human judgements of translations into a language different from the one used in testing.

Task-based evaluation for machine translation

The ordering of tasks according to their tolerance for errors, as determined by actual task outcomes provided in this paper, is the basis of a scalable and repeatable process by which to measure MT systems, which has advantages over previous methods.

RED: A Reference Dependency Based MT Evaluation Metric

A novel dependency-based evaluation metric which only employs the dependency information of the references; it achieves state-of-the-art performance, better than METEOR and SEMPOS at the system level, and is comparable with METEOR at the sentence level on WMT 2012 and WMT 2013.