Semi-supervised Chinese Word Segmentation based on Bilingual Information

Wei Chen and Bo Xu
This paper presents a bilingual semi-supervised Chinese word segmentation (CWS) method that leverages the natural segmentation information of English sentences. The proposed method learns features at three levels (character, phrase, and sentence), provided by multiple sub-models. We use a conditional random field (CRF) sub-model to learn monolingual grammars, a sub-model based on character-based alignment to obtain explicit segmenting knowledge, and another…

