## Document Clustering in Correlation Similarity Measure Space

- Taiping Zhang, Yuan Yan Tang, Bin Fang, Yong Xiang
- IEEE Transactions on Knowledge and Data…
- 2012

- Published 2013

Clustering is the process of grouping related documents from a large collection. Mining such related documents from an enormous, unlabelled database is challenging. To overcome this problem, clustering is used to filter the unlabelled documents from the large collection. Clustering can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and in how to find clusters efficiently. The standard K-Means algorithm is a well-known data mining algorithm that can effectively cluster data in a database. K-Means is a simple algorithm that has been adapted to many problem domains. With K-Means, however, the number of clusters has to be initialized manually. In this research paper, a new technique, DLCT (Double Link Cluster Tree), is merged with an enhanced K-Means algorithm, which makes clustering efficient without manually initializing the number of clusters or the optimal clusters. The resulting K-Means with DLCT allows automatic determination of the number of clusters on any type of data, such as documents and images. General Terms: Effective Clustering Using DLCT
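To make the limitation described in the abstract concrete, the following is a minimal sketch of standard K-Means (Lloyd's algorithm) in plain Python, in which the caller must supply the number of clusters `k`. It is not the paper's DLCT method, whose details are not given here. The function name `kmeans`, the toy `points`, and the choice of the first `k` points as initial centroids are illustrative assumptions; practical implementations usually randomise the initialization.

```python
def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm. Note that k must be chosen by the caller."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption for illustration: use the first k points as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Toy data with two visually obvious groups; k=2 is supplied by hand.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
```

On this toy input the first three points end up in one cluster and the last three in the other. Changing `k` changes the result, which is the manual choice that the abstract says DLCT is intended to remove.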

@inproceedings{Raj2013AutomatedCI,
title={Automated Clustering in K-Means Using Double Link Cluster Tree (DLCT)},
author={R. Ranga Raj},
year={2013}
}