One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper comparatively studies TF×IDF, LSI and multi-word for text representation. We used a Chinese and an English document(More)
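As a rough illustration of the two tasks named above, indexing (choosing the terms) and weighting (assigning each term a number), here is a minimal Python sketch of plain TF×IDF weighting. The function name `tf_idf`, the whitespace tokenization, and the three toy documents are assumptions made for this example; the paper's own corpora and exact weighting variant are not given in the excerpt.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: raw term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    # Indexing step: every distinct token is treated as an index term.
    # Document frequency = number of documents that contain the term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # raw term frequency within this document
        # Weighting step: tf * idf for each term in the document.
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

# Toy documents (hypothetical, for illustration only).
docs = [
    "text mining needs text representation".split(),
    "mutual information extracts multi-word terms".split(),
    "text representation uses index terms".split(),
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents receive a larger weight: in the first toy document, "mining" (found in one document) has a higher inverse document frequency than "text" (found in two).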
One of the deficiencies of mutual information is its poor capacity to measure the association of words with asymmetrical co-occurrence, which is common among multi-word expressions in texts. Moreover, threshold setting, which is decisive for the success of practical implementation of mutual information for multi-word extraction, brings about many parameters(More)
Inspired by the ideas of Swarm Intelligence and the "global brain", a concept of "community intelligence" is suggested in the present paper, reflecting that some "intelligent" features may emerge in a Web-mediated online community from interactions and knowledge-transmissions between the community members. This possible research field of community(More)
Text representation, which is a fundamental and necessary process for text-based intelligent information processing, includes the tasks of determining the index terms for documents and producing the numeric vectors corresponding to the documents. In this paper, multi-word, which is regarded as containing more contextual semantics than individual words and(More)
To extract multi-words from documents, mutual information (MI), a statistical method, is the most popular solution under consideration. However, there are two kinds of deficiencies inherent in MI. One is the problem of unilateral co-occurrence, and the other is the problem of rare occurrence. To attack these two problems, augmented mutual information(More)
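For orientation, the baseline that this abstract critiques can be sketched as ordinary pointwise mutual information over adjacent word pairs. This is a minimal Python sketch of plain MI only, not the augmented variant the paper proposes (its definition is cut off in the excerpt); the function name `pmi_bigrams` and the toy token sequence are assumptions made for this example.

```python
import math
from collections import Counter

def pmi_bigrams(tokens):
    """Score each adjacent word pair (x, y) with log2(P(x, y) / (P(x) * P(y))).

    Probabilities are simple relative frequencies (an assumption of this sketch).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Toy token sequence (hypothetical, for illustration only).
tokens = "text mining and text mining tools".split()
scores = pmi_bigrams(tokens)
```

A candidate multi-word would then be kept when its score exceeds a chosen threshold; picking that threshold is one of the practical difficulties with MI-based extraction.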
Therapeutic hypothermia is recommended for moderate and severe neonatal encephalopathy, but is being applied to a wider range of neonates than originally envisaged. To examine the clinical use of therapeutic hypothermia, data collected during the first 3 years (2012-2014) of the Baby Cooling Registry of Japan were analysed. Of 485 cooled neonates, 96.5%(More)
In this paper, we proposed a new approach using ontology to improve precision of terminology extraction from documents. Firstly, a linguistic method was used to extract the terminological patterns from documents. Then, similarity measures within the framework of ontology were employed to rank the semantic dependency of the noun words in a pattern. Finally,(More)