Centroid-Based Document Classification: Analysis and Experimental Results

Eui-Hong Han and George Karypis
In recent years we have seen tremendous growth in the volume of text documents available on the Internet, digital libraries, news sources, and company-wide intranets. Automatic text categorization, which is the task of assigning text documents to pre-specified classes (topics or themes) of documents, is an important task that can help both in organizing and in finding information in these huge resources. Text categorization presents unique challenges due to the large number of…
This paper has highly influenced 31 other papers.
This paper has 534 citations.


Publications citing this paper.

CenKNN: a scalable and effective text classifier

Data Mining and Knowledge Discovery • 2014

Projected-prototype based classifier for text categorization

Knowledge-Based Systems • 2013

Supervised term weighting centroid-based classifiers for text categorization

Knowledge and Information Systems • 2012

Semantic Scholar estimates that this publication has 535 citations based on the available data.



