Findings of the 2012 Workshop on Statistical Machine Translation
A large-scale manual evaluation of 103 machine translation systems submitted by 34 teams was conducted, and the resulting system rankings were used to measure how strongly 12 automatic evaluation metrics correlate with human judgments of translation quality.
(Meta-) Evaluation of Machine Translation
An extensive human evaluation was carried out not only to rank the different MT systems but also to perform a higher-level analysis of the evaluation process itself, revealing surprising facts about the most commonly used methodologies.
Findings of the 2013 Workshop on Statistical Machine Translation
We present the results of the WMT13 shared tasks, which included a translation task, a task for run-time estimation of machine translation quality, and an unofficial metrics task. This year, 143…
Findings of the 2014 Workshop on Statistical Machine Translation
This paper presents the results of the WMT14 shared tasks, which included a standard news translation task, a separate medical translation task, and a task for run-time estimation of machine translation quality…
Findings of the 2011 Workshop on Statistical Machine Translation
The results of the WMT11 shared tasks, which included a translation task, a system combination task, and a machine translation evaluation metrics task, are presented, showing how strongly 21 automatic evaluation metrics correlate with human judgments of translation quality.
Further Meta-Evaluation of Machine Translation
This paper analyzes the translation quality of machine translation systems for 10 language pairs translating between Czech, English, French, German, Hungarian, and Spanish, and uses the human judgments of the systems to analyze automatic evaluation metrics for translation quality.
Findings of the 2018 Conference on Machine Translation (WMT18)
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018. Participants were asked to build machine translation systems for any…
Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
A large-scale manual evaluation of 104 machine translation systems and 41 system combination entries was conducted, and the resulting system rankings were used to measure how strongly 26 automatic metrics correlate with human judgments of translation quality.
Findings of the 2009 Workshop on Statistical Machine Translation
A large-scale manual evaluation of 87 machine translation systems and 22 system combination entries was conducted, and the resulting system rankings were used to measure how strongly more than 20 automatic metrics correlate with human judgments of translation quality.
Findings of the 2016 Conference on Machine Translation
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks…