Uncertain Data Clustering in Distributed Peer-to-Peer Networks

Abstract

Uncertain data clustering is recognized as an essential task in data mining research. Many centralized clustering algorithms have been extended with new distance or similarity measures to handle uncertain data. With the rapid development of network applications, however, these centralized methods are limited when clustering data in large, dynamic, distributed peer-to-peer networks, owing to privacy and security concerns or the technical constraints of distributed environments. In this paper, we propose a novel distributed uncertain data clustering algorithm in which the centralized global clustering solution is approximated by performing distributed clustering. To shorten the execution time, a reduction technique is then applied to transform the proposed method into its deterministic form by replacing each uncertain data object with its expected centroid. Finally, an attribute-weight-entropy regularization technique enhances the proposed distributed clustering method to achieve better clustering results and to extract the essential features for cluster identification. Experiments on both synthetic and real-world data demonstrate the efficiency and superiority of the presented algorithm.
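The two ideas named in the abstract, reducing each uncertain object to its expected centroid and weighting attributes under entropy regularization, can be illustrated with a minimal sketch. It assumes each uncertain object is given as a finite list of sampled realizations, and it uses the common entropy-regularized weight form, with weights proportional to exp(-D_j / gamma), from the weighted k-means literature. The function names, the `gamma` parameter, and the toy data are hypothetical, and the paper's actual update rules and distributed consensus steps may differ.

```python
import math


def expected_centroid(samples):
    """Return the attribute-wise mean of a list of equally likely points."""
    n = len(samples)
    dims = len(samples[0])
    return [sum(s[j] for s in samples) / n for j in range(dims)]


def entropy_attribute_weights(points, center, gamma=1.0):
    """Assumed weight form: w_j proportional to exp(-D_j / gamma).

    D_j is the summed squared deviation of attribute j around the cluster
    center, so attributes with low dispersion receive larger weights.
    """
    dims = len(center)
    dispersion = [
        sum((p[j] - center[j]) ** 2 for p in points) for j in range(dims)
    ]
    # Shifting by the minimum leaves the normalized weights unchanged
    # and avoids underflow for large dispersions.
    d_min = min(dispersion)
    raw = [math.exp(-(d - d_min) / gamma) for d in dispersion]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical uncertain objects: each is a set of sampled realizations.
uncertain_objects = [
    [(1.0, 5.0), (1.5, 6.0), (0.5, 4.0)],
    [(1.25, 2.0), (0.75, 4.0)],
]

# Reduction step: one deterministic point per uncertain object.
reduced = [expected_centroid(obj) for obj in uncertain_objects]

# Treat both reduced points as one cluster and weight its attributes.
center = expected_centroid(reduced)
weights = entropy_attribute_weights(reduced, center, gamma=1.0)
```

In this toy data the first attribute barely varies across the reduced points while the second varies widely, so the first attribute receives the larger weight. A smaller `gamma` sharpens that contrast, and a larger one pushes the weights toward uniform.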

DOI: 10.1109/TNNLS.2017.2677093


Cite this paper

@article{Zhou2017UncertainDC,
  title={Uncertain Data Clustering in Distributed Peer-to-Peer Networks},
  author={Jin Zhou and Long Chen and C L Philip Chen and Yingxu Wang and Han-Xiong Li},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2017}
}