Document Representation and Dimension Reduction for Text Clustering

  title={Document Representation and Dimension Reduction for Text Clustering},
  author={M. Mahdi Shafiei and Singer Wang and Roger Zhang and Evangelos E. Milios and Bin Tang and Jane Tougas and Raymond J. Spiteri},
  journal={2007 IEEE 23rd International Conference on Data Engineering Workshop},
Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is conducted in which three different document representation methods for text are used, together with three Dimension Reduction Techniques (DRT), in the context of the text clustering problem. Several standard benchmark datasets are used. The three document representation methods considered are based on the vector space…