The Use of SVM for Chinese New Word Identification

Hongqiao Li, Changning Huang, Jianfeng Gao, Xiaozhong Fan
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. We discuss the distribution and types of new words empirically. In particular, we focus on new words of two surface patterns, which together account for more than 80% of the new words in our data sets: NW11 (two-character new word) and NW21 (a bi-character word followed by a single character). NWI is defined as a binary classification problem. A statistical learning approach…
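To make the two surface patterns concrete, here is a minimal sketch of how candidates might be collected from a segmenter's output before any classifier sees them. It is not the authors' implementation. The function name `candidate_new_words` and the sample sentence are hypothetical, and reading NW11 as two adjacent single-character tokens is an assumption based on the pattern's name.

```python
from typing import List, Tuple


def candidate_new_words(tokens: List[str]) -> List[Tuple[str, str]]:
    """Collect (pattern, candidate) pairs from adjacent segmenter tokens.

    Assumption: NW11 is read as two adjacent single-character tokens,
    NW21 as a two-character token followed by a single-character token.
    """
    candidates = []
    # Walk over every pair of neighbouring tokens.
    for left, right in zip(tokens, tokens[1:]):
        if len(left) == 1 and len(right) == 1:
            candidates.append(("NW11", left + right))
        elif len(left) == 2 and len(right) == 1:
            candidates.append(("NW21", left + right))
    return candidates


# Hypothetical segmenter output for illustration only.
tokens = ["我", "喜欢", "上", "网", "购"]
for pattern, candidate in candidate_new_words(tokens):
    print(pattern, candidate)
```

In a binary classification setting, each candidate produced this way would then be labeled as a new word or not. The features and the classifier itself are outside the scope of this sketch.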
This paper has 90 citations.


Publications citing this paper.

Apply text mining in analysis of patent document

2009 IEEE 10th International Conference on Computer-Aided Industrial Design & Conceptual Design • 2009

Mining Web data for Chinese segmentation


Chinese New Words Extraction Based on Machine Learning Approach

2006 International Conference on Machine Learning and Cybernetics • 2006

Online Detection of Domain-Specific New Words in Text Streams

2018 15th International Conference on Service Systems and Service Management (ICSSSM) • 2018

On the unsupervised analysis of domain-specific Chinese texts.

Proceedings of the National Academy of Sciences of the United States of America • 2016

Unknown Word Detection in Song Poetry

2016 IEEE First International Conference on Data Science in Cyberspace (DSC) • 2016
