Effect of term distributions on centroid-based text categorization


Most traditional text categorization approaches use term frequency (tf) and inverse document frequency (idf) to represent the importance of words and/or terms in classifying a text document. This paper describes an approach that applies term distributions, in addition to tf and idf, to improve the performance of centroid-based text categorization. Three…
DOI: 10.1016/j.ins.2003.07.007
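
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a plain tf-idf centroid classifier. It is not the paper's method: the term-distribution weighting is not described in the text available here and is not reproduced. The function names (`tfidf_vectors`, `train_centroids`, `classify`), the `log(N/df) + 1` idf variant, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def normalize(vec):
    """Scale a sparse vector (dict term -> weight) to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (unit-length tf-idf vectors, idf table)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    # Assumed idf variant: log(N/df) + 1, so terms present in every document keep some weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [normalize({t: c * idf[t] for t, c in Counter(d).items()}) for d in docs]
    return vecs, idf


def train_centroids(docs, labels):
    """Sum the unit tf-idf vectors of each class and renormalize to get one centroid per class."""
    vecs, idf = tfidf_vectors(docs)
    sums = defaultdict(lambda: defaultdict(float))
    for v, y in zip(vecs, labels):
        for t, w in v.items():
            sums[y][t] += w
    centroids = {y: normalize(s) for y, s in sums.items()}
    return centroids, idf


def classify(tokens, centroids, idf):
    """Assign the class whose centroid has the highest cosine similarity to the document."""
    # Terms unseen in training have no idf and are ignored.
    q = normalize({t: c * idf[t] for t, c in Counter(tokens).items() if t in idf})

    def cosine(a, b):
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(w * b.get(t, 0.0) for t, w in a.items())

    return max(centroids, key=lambda y: cosine(q, centroids[y]))


# Toy usage with made-up documents.
train_docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "price"],
]
train_labels = ["sport", "sport", "finance", "finance"]
centroids, idf = train_centroids(train_docs, train_labels)
print(classify(["goal", "team"], centroids, idf))
```

A term-distribution scheme of the kind the abstract mentions would presumably change how each term weight is computed before the centroids are built; the centroid and cosine steps above would stay the same under that assumption.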


