Clustering heterogeneous web usage data using Hierarchical Particle Swarm Optimization

Abstract

Data clustering aims to group data based on similarities between the data elements. Recently, due to the increasing complexity and amount of heterogeneous data, modeling such data for clustering has become a serious challenge. In this paper we tackle the problem of modeling heterogeneous web usage data for clustering. The main contribution is a new similarity measure for clustering heterogeneous web usage data. We use this similarity measure in our Particle Swarm Optimization (PSO) based clustering algorithm, Hierarchical Particle Swarm Optimization based clustering (HPSO-clustering). HPSO-clustering combines the qualities of hierarchical and partitional clustering to cluster data in a hierarchical agglomerative manner. We present the clustering results and explain the effects of the new similarity measure on inter-cluster and intra-cluster distances. These distance measures verify the applicability of the proposed similarity measure to web usage data.
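
The abstract evaluates clustering quality through intra-cluster and inter-cluster distances. As a rough illustration of what those two quantities measure, the sketch below computes them for a toy partition. It is not the paper's method: it assumes plain Euclidean distance rather than the proposed similarity measure (which the abstract does not define), and it assumes one common pair of definitions (mean point-to-centroid distance and mean centroid-to-centroid distance). All function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    # Assumption: Euclidean distance stands in for the paper's similarity measure.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    # Component-wise mean of a non-empty list of equal-length vectors.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def intra_cluster_distance(clusters, dist=euclidean):
    # Mean distance from each point to the centroid of its own cluster.
    # Lower values indicate more compact clusters.
    total, count = 0.0, 0
    for points in clusters:
        c = centroid(points)
        total += sum(dist(p, c) for p in points)
        count += len(points)
    return total / count


def inter_cluster_distance(clusters, dist=euclidean):
    # Mean distance between every pair of cluster centroids.
    # Higher values indicate better separated clusters.
    cents = [centroid(points) for points in clusters]
    pairs = list(combinations(cents, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Toy partition: two clusters of two-dimensional points.
clusters = [
    [[0.0, 0.0], [0.0, 2.0]],
    [[10.0, 0.0], [10.0, 2.0]],
]

intra = intra_cluster_distance(clusters)
inter = inter_cluster_distance(clusters)
print(intra, inter)
```

Both functions accept a `dist` argument, so a different distance or dissimilarity function can be swapped in without changing the rest of the sketch.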

DOI: 10.1109/SIS.2013.6615172

Cite this paper

@article{Alam2013ClusteringHW,
  title={Clustering heterogeneous web usage data using Hierarchical Particle Swarm Optimization},
  author={Shafiq Alam and Gillian Dobbie and Yun Sing Koh and Patricia Riddle},
  journal={2013 IEEE Symposium on Swarm Intelligence (SIS)},
  year={2013},
  pages={147-154}
}