Corpus Effects on the Evaluation of Automated Transliteration Systems

@inproceedings{Karimi2007CorpusEO,
  title={Corpus Effects on the Evaluation of Automated Transliteration Systems},
  author={Sarvnaz Karimi and Andrew Turpin and Falk Scholer},
  booktitle={ACL},
  year={2007}
}
Most current machine transliteration systems employ a corpus of known source-target word pairs to train their system, and typically evaluate their systems on a similar corpus. In this paper we explore the performance of transliteration systems on corpora that are varied in a controlled way. In particular, we control the number and prior language knowledge of the human transliterators used to construct the corpora, and the origin of the source words that make up the corpora. We find that the word…