• Publications
Some Effective Techniques for Naive Bayes Text Classification
While naive Bayes is quite effective in various data mining tasks, it shows a disappointing result in the automatic text classification problem. Based on the observation of naive Bayes for the ...
  • 388
  • 17
A practical hypertext catergorization method using links and incrementally available class information
As WWW grows at an increasing speed, a classifier targeted at hypertext has become in high demand. While document categorization is quite a mature, the issue of utilizing hypertext structure and ...
  • 191
  • 12
Text genre classification with genre-revealing and subject-revealing features
Subject or propositional content has been the focus of most classification research. Genre or style, on the other hand, is a different and important property of text, and automatic text genre ...
  • 138
  • 8
Overview of CLIR Task at the Third NTCIR Workshop
Department of Library and Information Science, National Taiwan University Taipei 10617, Taiwan khchen@ccms.ntu.edu.tw Department of Computer Science and Information Engineering, National Taiwan ...
  • 29
  • 8
PTE: Enumerating Trillion Triangles On Distributed Systems
How can we enumerate triangles from an enormous graph with billions of vertices and edges? Triangle enumeration is an important task for graph data analysis with many applications including ...
  • 35
  • 7
Automatic identification and back-transliteration of foreign words for information retrieval
Many foreign words and English words appear in Korean texts, especially in the areas of science and engineering. We recognize two issues related to foreign words, which should be addressed for ...
  • 84
  • 7
Automatic construction of a large-scale situation ontology by mining how-to instructions from the web
With the growing interests in semantic web services and context-aware computing, the importance of ontologies, which enable us to perform context-aware reasoning, has been accepted widely. While ...
  • 52
  • 7
Overview of CLIR Task at the Fourth NTCIR Workshop
The purpose of this paper is to overview research efforts at the NTCIR-6 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, ...
  • 121
  • 6
Automatic Extraction of Cause-Effect Information from Newspaper Text Without Knowledge-based Inferencing
This study investigated how effectively cause-effect information can be extracted from newspaper text using a simple computational method (i.e. without knowledge-based inferencing and without full ...
  • 92
  • 6
Unsupervised word sense disambiguation using WordNet relatives
This paper describes a sense disambiguation method for a polysemous target noun using the context words surrounding the target noun and its WordNet relatives, such as synonyms, hypernyms and ...
  • 53
  • 6