Syntactic Constraints on Phrase Extraction for Phrase-Based Machine Translation

@inproceedings{Cao2010SyntacticCO,
  title={Syntactic Constraints on Phrase Extraction for Phrase-Based Machine Translation},
  author={Hailong Cao and Andrew M. Finch and Eiichiro Sumita},
  booktitle={SSST@COLING},
  year={2010}
}
A typical phrase-based machine translation (PBMT) system uses phrase pairs extracted from word-aligned parallel corpora. All phrase pairs that are consistent with word alignments are collected. The resulting phrase table is very large and includes many non-syntactic phrases which may not be necessary. We propose to filter the phrase table based on source language syntactic constraints. Rather than filter out all non-syntactic phrases, we only apply syntactic constraints when there is phrase…
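
To make the extraction step in the abstract concrete, the following is a minimal Python sketch of collecting phrase pairs that are consistent with a word alignment, using the standard consistency criterion from the PBMT literature: a span pair is kept if it contains at least one alignment link and no link crosses its boundary. It is an illustration only, not the authors' implementation. The names `is_consistent`, `extract_phrase_pairs`, `is_syntactic`, and the `max_len` limit are hypothetical, and the constituent spans in the usage lines are an assumed flat parse. The specific filtering rule proposed in the paper is truncated above and is not reproduced here. The sketch only marks which source spans would count as non-syntactic.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]            # inclusive (start, end) word indices
Alignment = Set[Tuple[int, int]]  # (source index, target index) links


def is_consistent(alignment: Alignment, src: Span, tgt: Span) -> bool:
    """True if no link crosses the span boundary and at least one link is inside."""
    has_link = False
    for s, t in alignment:
        s_in = src[0] <= s <= src[1]
        t_in = tgt[0] <= t <= tgt[1]
        if s_in != t_in:
            # One end of the link is inside the pair, the other outside.
            return False
        if s_in:
            has_link = True
    return has_link


def extract_phrase_pairs(
    src_len: int, tgt_len: int, alignment: Alignment, max_len: int = 7
) -> List[Tuple[Span, Span]]:
    """Brute-force enumeration of all consistent span pairs up to max_len words."""
    pairs = []
    for s_start in range(src_len):
        for s_end in range(s_start, min(s_start + max_len, src_len)):
            for t_start in range(tgt_len):
                for t_end in range(t_start, min(t_start + max_len, tgt_len)):
                    src, tgt = (s_start, s_end), (t_start, t_end)
                    if is_consistent(alignment, src, tgt):
                        pairs.append((src, tgt))
    return pairs


def is_syntactic(src: Span, constituent_spans: Set[Span]) -> bool:
    """A source span is syntactic if it exactly covers a parse-tree constituent."""
    return src in constituent_spans


# Usage with a toy sentence pair (assumed data, for illustration only).
source = "the old house".split()
target = "das alte Haus".split()
links = {(0, 0), (1, 1), (2, 2)}
# Assumed flat parse: (NP (DT the) (JJ old) (NN house))
constituents = {(0, 0), (1, 1), (2, 2), (0, 2)}

for src_span, tgt_span in extract_phrase_pairs(len(source), len(target), links):
    src_text = " ".join(source[src_span[0]: src_span[1] + 1])
    tgt_text = " ".join(target[tgt_span[0]: tgt_span[1] + 1])
    label = "syntactic" if is_syntactic(src_span, constituents) else "non-syntactic"
    print(f"{src_text} ||| {tgt_text} ||| {label}")
```

The brute-force enumeration is quadratic in both sentence lengths and is chosen for readability. Production extractors restrict the target search to the span covered by the source words' links.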