Reassessment of the Role of Phrase Extraction in PBSMT

Abstract

In this paper we study in detail the relation between word alignment and phrase extraction. First, we analyze different word alignments according to several characteristics and compare them to hand-aligned data. Second, we analyze the phrase pairs generated by these alignments. We observe that the number of unaligned words has a large impact on the characteristics of the phrase table. A manual evaluation of phrase-pair quality shows that an increase in the number of unaligned words results in lower quality. Finally, we present translation results obtained by using the number of unaligned words as features, which yield an improvement of up to 2 BLEU points (BP).
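
To make the quantity discussed in the abstract concrete, the following is a minimal sketch of how the number of unaligned words in a single phrase pair might be counted. It is an illustration under stated assumptions, not the authors' implementation: the function name `count_unaligned`, the representation of alignment links as 0-based `(source_index, target_index)` tuples local to the phrase pair, and the example phrase pair are all hypothetical.

```python
from typing import Iterable, Tuple


def count_unaligned(
    src_len: int,
    tgt_len: int,
    alignment: Iterable[Tuple[int, int]],
) -> Tuple[int, int]:
    """Count unaligned words on each side of one phrase pair.

    Assumption (not from the paper): `alignment` holds 0-based
    (source_index, target_index) links that all fall inside the phrase pair.
    """
    links = list(alignment)
    aligned_src = {i for i, _ in links}  # source positions with at least one link
    aligned_tgt = {j for _, j in links}  # target positions with at least one link
    return src_len - len(aligned_src), tgt_len - len(aligned_tgt)


# Hypothetical phrase pair: "la casa verde" / "the green house",
# with links casa-house and verde-green; "la" and "the" are left unaligned.
unaligned = count_unaligned(3, 3, [(1, 2), (2, 1)])
print(unaligned)
```

Counts of this kind could then be attached to each phrase-table entry as additional feature values; how the paper itself defines and weights those features is not specified in the abstract.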

Cite this paper

@inproceedings{Guzmn2009ReassessmentOT,
  title={Reassessment of the Role of Phrase Extraction in PBSMT},
  author={Francisco Guzm{\'a}n and Qin Gao and Stephan Vogel},
  year={2009}
}