A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005


We present a Chinese word segmentation system submitted to the closed track of the Sighan Bakeoff 2005. Our segmenter was built using a conditional random field sequence model, which provides a framework for using a large number of linguistic features such as character identity, morphological features, and character reduplication features. Because our morphological features…
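
As a rough illustration of the kind of character-level features the abstract mentions, the sketch below builds a feature dictionary for each character in a sentence and converts a segmented sentence into per-character labels. The function names (`char_features`, `words_to_labels`), the feature keys, the boundary symbols, and the B/I label scheme are all assumptions made for this example; they are not taken from the paper and do not reproduce its actual feature set, which also includes morphological features not shown here.

```python
def char_features(sentence, i):
    """Return illustrative features for the character at position i.

    Hypothetical feature set: character identity in a one-character
    window, one bigram, and two reduplication indicators.
    """
    n = len(sentence)
    prev_c = sentence[i - 1] if i > 0 else "<BOS>"      # assumed boundary symbol
    next_c = sentence[i + 1] if i < n - 1 else "<EOS>"  # assumed boundary symbol
    return {
        "c0": sentence[i],                   # identity of the current character
        "c-1": prev_c,                       # identity of the previous character
        "c+1": next_c,                       # identity of the next character
        "c-1c0": prev_c + sentence[i],       # bigram ending at the current character
        "redup_prev": prev_c == sentence[i], # same character as the previous one
        "redup_next": next_c == sentence[i], # same character as the next one
    }


def words_to_labels(words):
    """Convert a list of words into per-character labels.

    Assumed scheme: "B" marks the first character of a word,
    "I" marks every later character of the same word.
    """
    labels = []
    for word in words:
        labels.extend(["B"] + ["I"] * (len(word) - 1))
    return labels


# Usage: a segmented toy sentence, its labels, and per-character features.
words = ["看看", "书"]
sentence = "".join(words)
labels = words_to_labels(words)
features = [char_features(sentence, i) for i in range(len(sentence))]
```

Pairs of feature dictionaries and labels like these are the usual input to a sequence labeler; training a conditional random field on them would require a separate library and is left out of this sketch.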

