The impact of semi-supervised clustering on text classification

Abstract

This paper addresses the problem of learning to classify texts by exploiting information derived from clustering both the training and testing sets. Incorporating the knowledge gained from clustering into the feature-space representation of the texts is expected to boost the performance of a classifier. Two approaches to clustering are described: an unsupervised one and a semi-supervised one. We present an empirical study of the proposed algorithms on a variety of datasets. The results are encouraging, indicating that information derived from clustering can yield highly accurate text classifiers.
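
The general idea of feeding cluster information back into the feature space can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a seeded k-means in which the initial centroids are the class means of the labeled training texts, and it assumes the cluster information is appended as one-hot membership features. The function names (`kmeans`, `class_means`, `augment`) and the toy term-count vectors are hypothetical.

```python
from math import dist

def kmeans(points, centroids, iters=10):
    """Lloyd's k-means from the given initial centroids; returns one cluster id per point."""
    centroids = [list(c) for c in centroids]
    assign = []
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of each cluster's members (an empty cluster keeps its centroid).
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def class_means(train_x, train_y):
    """Assumed semi-supervised step: one seed centroid per class, from labeled texts only."""
    means = []
    for c in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == c]
        means.append([sum(col) / len(rows) for col in zip(*rows)])
    return means

def augment(train_x, train_y, test_x):
    """Cluster training and testing texts together, then append one-hot cluster features."""
    points = train_x + test_x
    seeds = class_means(train_x, train_y)
    assign = kmeans(points, seeds)
    k = len(seeds)
    augmented = [p + [1.0 if a == j else 0.0 for j in range(k)]
                 for p, a in zip(points, assign)]
    return augmented[:len(train_x)], augmented[len(train_x):]

# Toy term-count vectors over a hypothetical three-word vocabulary.
train_x = [[3.0, 0.0, 1.0], [0.0, 4.0, 1.0]]
train_y = ["sports", "politics"]
test_x = [[2.0, 0.0, 0.0], [0.0, 3.0, 2.0]]

train_aug, test_aug = augment(train_x, train_y, test_x)
```

Any standard classifier could then be trained on `train_aug` and applied to `test_aug`. An unsupervised variant of the same sketch would pass label-free initial centroids to `kmeans`, for example a few of the points themselves, instead of the class means.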

DOI: 10.1145/2491845.2491866

Cite this paper

@inproceedings{Kyriakopoulou2013TheIO,
  title={The impact of semi-supervised clustering on text classification},
  author={Antonia Kyriakopoulou and Theodore Kalamboukis},
  booktitle={Panhellenic Conference on Informatics},
  year={2013}
}