Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus
@inproceedings{Zhang2000ExtractionOC, title={Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus}, author={J. Zhang and Jianfeng Gao and M. Zhou}, booktitle={ACL 2000}, year={2000} }
This paper introduces a statistical method for extracting Chinese compound words from a very large corpus. The method is based on mutual information and context dependency. Experimental results show that it is efficient and robust compared with other approaches. We also examine the impact of different parameter settings, corpus size, and corpus heterogeneity on the extraction results. Finally, we present information retrieval results to show the usefulness of the extracted compounds.
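The abstract names mutual information as one of the two signals used to pick compound candidates. The following is a minimal sketch of that idea only, assuming pre-segmented tokens and the standard pointwise mutual information formula; the function name `pmi_scores`, the `min_count` cutoff, and the toy input are hypothetical, and the paper's exact formulation, thresholds, and context-dependency test are not given on this page.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent token pairs by pointwise mutual information (PMI).

    Illustrative only: `min_count` is an assumed frequency cutoff,
    not a parameter taken from the paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        # PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) )
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy token sequence standing in for a segmented corpus.
tokens = "a b a b c d a b c e".split()
scores = pmi_scores(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies would predict get a positive score and would be kept as candidates; a real extractor would then apply further filters, such as the context-dependency check the abstract mentions.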
55 Citations
Semi-supervised Chinese compound word extraction based on HMM
- Computer Science
- 2008 7th World Congress on Intelligent Control and Automation
- 2008
- 3
A Method of Automatically Acquiring Concepts in Domain Ontology from Chinese Corpus
- Computer Science
- 2006 10th International Conference on Computer Supported Cooperative Work in Design
- 2006
Error feedback based lexical entity extraction for Chinese language modeling
- Computer Science
- 2013 6th International Congress on Image and Signal Processing (CISP)
- 2013
Automatic Technical Term Extraction Based on Term Association
- Computer Science
- 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery
- 2008
- 1
Accessor Variety Criteria for Chinese Word Extraction
- Computer Science
- Computational Linguistics
- 2004
- 134
References
PAT-tree-based keyword extraction for Chinese information retrieval
- Computer Science
- SIGIR '97
- 1997
- 279
Information Technology: The Fifth Text Retrieval Conference (TREC5), NIST SP 500-238
- 1996
Large-scale automatic extraction of an English-Chinese lexicon
- Machine Translation
- 1995
Identification of Unknown Words From a Corpus
- Computer Processing of Chinese and Oriental Languages
- 1994
Corpus-based Automatic Compound Extraction with Mutual Information and Relative Frequency Count
- Computer Science
- ROCLING
- 1993
- 23