Incremental Subspace Clustering over Multiple Data Streams

Abstract

Data streams are often locally correlated, with a subset of streams exhibiting coherent patterns over a subset of time points. Subspace clustering can discover clusters of objects in different subspaces. However, traditional subspace clustering algorithms for static data sets are not readily adapted to incremental clustering, and frequent re-clustering over dynamically changing stream data is very expensive. In this paper, we present an efficient incremental subspace clustering algorithm for multiple streams over sliding windows. Our algorithm detects all the delta-CC-Clusters, which capture the coherent changing patterns among a set of streams over a set of time points. delta-CC-Clusters are incrementally generated by traversing a directed acyclic graph, the pDAG. We propose efficient insertion and deletion operations to update the pDAG dynamically. In addition, effective pruning techniques are applied to reduce the search space. Experiments on real data sets demonstrate the performance of our algorithm.
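To make the setting concrete, the following is a minimal sketch of a sliding window over several synchronized streams, together with a simple coherence test on a chosen subset of streams and time points. It is not the algorithm of the paper: the abstract does not define delta-CC-Clusters or the pDAG, so the class `MultiStreamWindow`, the function `is_coherent`, and the criterion used here (pairwise differences between streams vary by at most `delta`) are all assumptions made purely for illustration.

```python
from collections import deque
from itertools import combinations


class MultiStreamWindow:
    """Fixed-width sliding window over n synchronized streams (illustrative only)."""

    def __init__(self, n_streams, width):
        self.n_streams = n_streams
        self.width = width
        # Each entry is one time point: a tuple with one value per stream.
        self.columns = deque()

    def push(self, values):
        """Insert the newest time point; return the evicted oldest one, or None."""
        if len(values) != self.n_streams:
            raise ValueError("expected one value per stream")
        self.columns.append(tuple(values))
        if len(self.columns) > self.width:
            return self.columns.popleft()
        return None


def is_coherent(window, streams, times, delta):
    """Assumed criterion, not the paper's definition: for every pair of
    selected streams, the difference between them changes by at most
    `delta` across the selected time points (positions in the window)."""
    if not times:
        return True
    cols = [window.columns[t] for t in times]
    for a, b in combinations(streams, 2):
        diffs = [c[a] - c[b] for c in cols]
        if max(diffs) - min(diffs) > delta:
            return False
    return True


window = MultiStreamWindow(n_streams=3, width=3)
for point in ([1, 11, 5], [2, 12, 9], [4, 14, 2]):
    window.push(point)
evicted = window.push([3, 13, 7])  # window is full, so the oldest point leaves
```

In this toy run, streams 0 and 1 differ by a constant offset at every time point in the window, so they pass the assumed test for any non-negative `delta`, while streams 0 and 2 do not for a small `delta`. An incremental method such as the one described above would update its cluster structure on each insertion and eviction rather than re-checking every subset from scratch.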

DOI: 10.1109/ICDM.2007.100

Cite this paper

@article{Zhang2007IncrementalSC, title={Incremental Subspace Clustering over Multiple Data Streams}, author={Qi Zhang and Jinze Liu and Wei Wang}, journal={Seventh IEEE International Conference on Data Mining (ICDM 2007)}, year={2007}, pages={727-732} }