Pushing the Limits of Translation Quality Estimation

@article{Martins2017PushingTL,
  title={Pushing the Limits of Translation Quality Estimation},
  author={Andr{\'e} F. T. Martins and Marcin Junczys-Dowmunt and Fabio Kepler and Ram{\'o}n Fern{\'a}ndez Astudillo and Chris Hokamp and Roman Grundkiewicz},
  journal={Transactions of the Association for Computational Linguistics},
  year={2017},
  volume={5},
  pages={205--218}
}
Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways. However, this potential is currently limited by the relatively low accuracy of existing systems. In this paper, we achieve remarkable improvements by exploiting synergies between the related tasks of word-level quality estimation and automatic post-editing. First, we stack a new, carefully engineered, neural model into a rich feature-based word…
deepQuest: A Framework for Neural-based Quality Estimation
TLDR: This work presents a neural framework that is able to accommodate neural QE approaches at these fine-grained levels and generalize them to the level of documents, and applies QE models to the output of both statistical and neural MT systems for a series of European languages.
Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation
TLDR: This work presents a novel approach to Automatic Post-Editing (APE) and Word-Level Quality Estimation (QE) using ensembles of specialized Neural Machine Translation (NMT) systems, training a suite of NMT models that use different input representations but share the same output space.
Contextual Encoding for Translation Quality Estimation
TLDR: This model was submitted as the CMU entry to the WMT2018 shared task on QE, and achieves strong results, ranking first in three of the six tracks.
Multi-task Stack Propagation for Neural Quality Estimation
TLDR: A multi-task stack propagation approach is proposed, which extensively applies stack propagation to fully train the Predictor-Estimator on a continuous stacking architecture, and multi-task learning to enhance the training data with related quality estimation tasks at other levels.
Verdi: Quality Estimation and Error Detection for Bilingual Corpora
TLDR: Verdi, a novel framework for word-level and sentence-level post-editing effort estimation for bilingual corpora, is proposed; it beats the winner of the competition and outperforms other baseline methods by a large margin.
Automatic Post-Editing for Machine Translation
TLDR: A thorough investigation of APE as a downstream task is performed in order to understand its potential to improve translation quality and advance the core technology, covering classical methods through recent deep learning-based solutions.
Combining Quality Estimation and Automatic Post-editing to Enhance Machine Translation output
We investigate different strategies for combining quality estimation (QE) and automatic post-editing (APE) to improve the output of machine translation (MT) systems. The joint contribution of the two…
Incorporating Syntactic Knowledge in Neural Quality Estimation for Machine Translation
TLDR: Experimental results on the WMT17 quality estimation datasets show that both the sentence-level Pearson correlation score and the word-level F1-mult score can be improved by the syntactic knowledge.
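The word-level F1-mult metric mentioned above is the product of the F1 scores for the OK and BAD tags, the standard word-level measure in the WMT QE shared tasks. A minimal sketch (the gold and predicted tag sequences here are hypothetical):

```python
def f1_mult(gold, pred):
    """F1-mult: product of the per-class F1 scores for OK and BAD tags."""
    def f1(label):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        if tp == 0:
            return 0.0
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)
    return f1("OK") * f1("BAD")

gold = ["OK", "OK", "BAD", "OK", "BAD", "OK"]
pred = ["OK", "BAD", "BAD", "OK", "OK", "OK"]
score = f1_mult(gold, pred)
```

Multiplying the two per-class F1 scores penalizes systems that predict only the majority OK class, which would otherwise score well under plain accuracy.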
Ensemble Distilling Pretrained Language Models for Machine Translation Quality Estimation
TLDR: This paper proposes an effective method of using pretrained language models to improve the performance of QE, combining two popular pretrained models, BERT and XLM, to create a very strong baseline for both sentence-level and word-level QE.
Predicting insertion positions in word-level machine translation quality estimation
TLDR: Several feature sets and neural network architectures are explored and evaluated on publicly available datasets used in previous evaluation campaigns for word-level MT QE; the results confirm the feasibility of the proposed approach, as well as the usefulness of sharing information between the two prediction tasks in order to obtain more reliable quality estimations.

References

Showing 1-10 of 52 references
Exploiting Objective Annotations for Minimising Translation Post-editing Effort
TLDR: It is shown that estimations resulting from using post-editing time, a simple and objective annotation, can minimise translation post-editing effort in a practical, task-based scenario.
Referential Translation Machines for Quality Estimation
TLDR: Novel techniques based on individual RTM models are developed for solving all subtasks in the WMT13 quality estimation (QE) task (QET 2013), achieving improvements over the previous year's QE task results.
SHEF-Lite: When Less is More for Translation Quality Estimation
TLDR: The results give evidence that Gaussian Processes achieve state-of-the-art performance as a modelling approach for translation quality estimation, and that carefully selecting features and instances can further improve, or at least maintain, the same performance levels while making the problem less resource-intensive.
QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation
TLDR: The submitted system combines a continuous-space deep neural network, which learns a bilingual feature representation from scratch, with a linear combination of the manually defined baseline features provided by the task organizers, and shows significant improvements over the combined systems.
Log-linear Combinations of Monolingual and Bilingual Neural Machine Translation Models for Automatic Post-Editing
TLDR: The application of neural translation models to the APE problem is explored; good results are achieved by treating different models as components in a log-linear model, allowing for multiple inputs (the MT output and the source) that are decoded to the same target language (post-edited translations).
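A log-linear combination of this kind scores each candidate as a weighted sum of the component models' log-probabilities and picks the highest-scoring one. A minimal sketch, where the candidate strings, their component scores, and the weights are all hypothetical (in practice the weights are tuned on a development set):

```python
def log_linear_score(log_probs, weights):
    """Weighted sum of component-model log-probabilities."""
    return sum(w * lp for w, lp in zip(weights, log_probs))

# Hypothetical log-probabilities of two candidate post-edits under
# two component models (e.g. a monolingual mt->pe and a bilingual src->pe model).
candidates = {
    "the cat sat on the mat": [-2.1, -1.8],
    "a cat sits on the mat":  [-2.5, -1.0],
}
weights = [0.6, 0.4]  # hypothetical tuned interpolation weights
best = max(candidates, key=lambda c: log_linear_score(candidates[c], weights))
```

Because the combination happens in log space, this is equivalent to a weighted product of the component probabilities, which is how such ensembles are typically decoded.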
Adaptive Quality Estimation for Machine Translation
TLDR: This work proposes an online framework for adaptive QE that targets reactivity and robustness to user and domain changes, and demonstrates the effectiveness of this approach in different testing conditions involving user and domain changes.
Improving Evaluation of Machine Translation Quality Estimation
TLDR: This paper provides an analysis of methods of comparison and proposes the use of the unit-free Pearson correlation, in addition to an appropriate method of significance testing of improvements over a baseline, to address areas of concern with widely used measures.
A Study of Translation Edit Rate with Targeted Human Annotation
TLDR: A new, intuitive measure for evaluating machine-translation output is examined that avoids the knowledge-intensiveness of more meaning-based approaches and the labor-intensiveness of human judgments; results indicate that HTER correlates with human judgments better than HMETEOR, and that the four-reference variants of TER and HTER correlate with human judgments as well as, or better than, a second human judgment does.
UAlacant word-level machine translation quality estimation system at WMT 2015
TLDR: The Universitat d'Alacant submissions for the machine translation quality estimation (MTQE) shared task at WMT 2015 are described, where the team participated in the word-level MTQE subtask.