Error Analysis of Statistical Machine Translation Output

Abstract

Evaluation of automatic translation output is a difficult task. Several performance measures, such as Word Error Rate, Position-independent word Error Rate, and the BLEU and NIST scores, are widely used and provide a useful tool for comparing different systems and for evaluating improvements within a system. However, the interpretation of all of these measures is far from clear, and these measures alone cannot identify the most prominent sources of error in a given system. Some analysis of the generated translations is therefore needed in order to identify the main problems and to focus the research efforts. This area is, however, mostly unexplored, and only a few works have dealt with it until now. In this paper we present a framework for classifying the errors of a machine translation system and carry out an error analysis of the system used by RWTH in the first TC-STAR evaluation.
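The two edit-distance-based measures mentioned in the abstract can be illustrated with a minimal sketch. Word Error Rate is the word-level Levenshtein distance between hypothesis and reference, normalized by reference length; Position-independent Error Rate drops the ordering constraint and compares only the bags of words. The function names below are illustrative, and the PER shown is a simplified variant that ignores the hypothesis-length penalty used in some formulations.

```python
from collections import Counter

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming table for edit distance between word sequences.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # cost of deleting all reference words up to i
    for j in range(len(h) + 1):
        d[0][j] = j  # cost of inserting all hypothesis words up to j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            substitution = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(r)][len(h)] / len(r)

def per(reference: str, hypothesis: str) -> float:
    """Simplified position-independent error rate: bag-of-words mismatch."""
    r, h = Counter(reference.split()), Counter(hypothesis.split())
    matches = sum((r & h).values())  # multiset intersection
    ref_len = sum(r.values())
    return (ref_len - matches) / ref_len
```

For example, a hypothesis with the right words in the wrong order gets PER 0 but a nonzero WER, which is why the two measures are usually reported together.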

5 Figures and Tables

Citations per Year

Semantic Scholar estimates that this publication has 216 citations based on the available data.


Cite this paper

@inproceedings{Vilar2006ErrorAO,
  title={Error Analysis of Statistical Machine Translation Output},
  author={David Vilar and Jia Xu and Luis Fernando D'Haro and Hermann Ney},
  booktitle={LREC},
  year={2006}
}