A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005

Huihsin Tseng, Pi-Chuan Chang, Galen Andrew, Daniel Jurafsky, Christopher D. Manning
SIGHAN@IJCNLP 2005
We present a Chinese word segmentation system submitted to the closed track of Sighan Bakeoff 2005. Our segmenter was built using a conditional random field sequence model, which provides a framework for using a large number of linguistic features, such as character identity, morphological features, and character reduplication features. Because our morphological features were extracted from the training corpora automatically, our system was not biased toward any particular variety of Mandarin. Thus, our system…
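
To make the character-level setup concrete, here is a minimal sketch of how a segmented sentence could be turned into per-character labels and feature dictionaries of the kind a CRF sequence model consumes. The binary label scheme, the feature names, and the helpers `words_to_labels` and `char_features` are assumptions made for illustration; they are not the paper's actual feature templates, and the morphological features mentioned above are not modeled here.

```python
from typing import Dict, List, Tuple


def words_to_labels(words: List[str]) -> Tuple[str, List[int]]:
    """Turn a segmented sentence into (characters, labels).

    Label 1 marks a character that starts a new word, 0 one that continues
    the current word. This binary scheme is an assumption of the sketch.
    """
    labels: List[int] = []
    for word in words:
        if not word:
            continue  # ignore empty tokens so labels stay aligned with chars
        labels.extend([1] + [0] * (len(word) - 1))
    return "".join(words), labels


def char_features(chars: str, i: int) -> Dict[str, object]:
    """Illustrative features for position i (hypothetical templates)."""

    def at(j: int) -> str:
        # Out-of-range positions map to a padding symbol.
        return chars[j] if 0 <= j < len(chars) else "<pad>"

    return {
        # Character identity in a small window around position i.
        "c0": at(i),
        "c-1": at(i - 1),
        "c+1": at(i + 1),
        # Character bigrams.
        "c-1c0": at(i - 1) + at(i),
        "c0c+1": at(i) + at(i + 1),
        # Reduplication: is the current character a repeat of the previous one?
        "redup_prev": i > 0 and chars[i] == chars[i - 1],
    }


if __name__ == "__main__":
    chars, labels = words_to_labels(["看看", "书"])
    for i, label in enumerate(labels):
        print(chars[i], label, char_features(chars, i))
```

Each sentence then becomes a sequence of feature dictionaries paired with a label sequence, which is the input shape most CRF toolkits expect; segmentation of new text amounts to predicting the labels and starting a new word at every 1.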



