Research of text clustering based on improved VSM by TF under the framework of Mahout


Currently, traditional text clustering methods handle only small volumes of data, and the text representation model they rely on is the traditional vector space model (VSM). Traditional text clustering is inefficient when processing big data, and clustering quality is poor when the traditional VSM is used for text representation. To solve these two…


3 Figures and Tables