A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005

Abstract

We present a Chinese word segmentation system submitted to the closed track of Sighan Bakeoff 2005. Our segmenter was built using a conditional random field sequence model that provides a framework for using a large number of linguistic features, such as character identity, morphological features, and character reduplication features. Because our morphological features…

4 Figures and Tables

Statistics

[Chart: Citations per Year, 2006 to 2018]

Semantic Scholar estimates that this publication has 398 citations based on the available data.
