Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus

@inproceedings{Zhang2000ExtractionOC,
  title={Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus},
  author={J. Zhang and Jianfeng Gao and M. Zhou},
  booktitle={ACL 2000},
  year={2000}
}
  • J. Zhang, Jianfeng Gao, M. Zhou
  • Published in ACL 2000
  • Computer Science
  • This paper introduces a statistical method for extracting Chinese compound words from a very large corpus. The method is based on mutual information and context dependency. Experimental results show that it is efficient and robust compared with other approaches. We also examine the impact of different parameter settings, corpus size, and corpus heterogeneity on the extraction results. Finally, we present information retrieval results to show the usefulness of the extracted compounds.
    55 Citations
    A Study on Multi-word Extraction from Chinese Documents
    Semi-supervised Chinese compound word extraction based on HMM
    • H. He, B. Chen, J. Guo
    • Computer Science
    • 2008 7th World Congress on Intelligent Control and Automation
    • 2008
    A Method of Automatically Acquiring Concepts in Domain Ontology from Chinese Corpus
    • Yajun Liu, L. Zhai, Lisha Gao
    • Computer Science
    • 2006 10th International Conference on Computer Supported Cooperative Work in Design
    • 2006
    Error feedback based lexical entity extraction for Chinese language modeling
    Automatic Technical Term Extraction Based on Term Association
    Chinese Word Segmentation Based on Contextual Entropy
    Accessor Variety Criteria for Chinese Word Extraction
