Black-Box/Glass-Box Evaluation in Shiraz

Abstract

The Shiraz project included an evaluation component: two ‘glass-box’ evaluations were performed during the project, and a black-box evaluation was performed at its end. The evaluations were based on a bilingual tagged test corpus of 3000 sentences, and evaluation tools were developed to automate the process. The glass-box evaluations covered components of the MT system, in particular the Persian morphological analyzer, the dictionary, and the parser. The evaluation of the translations themselves (the black-box evaluation) was performed manually on a subset of the test corpus. This paper outlines the problems encountered in trying to use these evaluations for development and testing purposes as well as for traditional ‘off-line’ evaluation.

Cite this paper

@inproceedings{Zajac1998BlackBoxGlassBoxEI, title={Black-Box/Glass-Box Evaluation in Shiraz}, author={R{\'e}mi Zajac and Steve Helmreich and Karine Megerdoomian}, year={1998} }