Automatic Subspace Clustering of High Dimensional Data
@article{Agrawal2005AutomaticSC,
  title={Automatic Subspace Clustering of High Dimensional Data},
  author={R. Agrawal and J. Gehrke and D. Gunopulos and P. Raghavan},
  journal={Data Mining and Knowledge Discovery},
  year={2005},
  volume={11},
  pages={5--33}
}
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates…
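The abstract's central idea, finding dense regions in subspaces rather than in the full space, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the data is already scaled to [0, 1) in every dimension, that each dimension is cut into `xi` equal intervals, and that a grid unit counts as dense when it holds more than a fraction `tau` of the points. The names `cell_of`, `dense_units`, `xi`, and `tau` are chosen here for illustration. The sketch covers only the bottom-up search for dense units, and leaves out joining adjacent units into clusters and generating cluster descriptions.

```python
from collections import Counter
from itertools import combinations


def cell_of(point, dims, xi):
    """Map a point in [0, 1)^d to its grid unit restricted to `dims`."""
    return tuple((d, min(int(point[d] * xi), xi - 1)) for d in dims)


def dense_units(points, xi=4, tau=0.3):
    """Return every dense unit, in every subspace, found bottom-up.

    A unit is a tuple of (dimension, interval index) pairs sorted by dimension.
    """
    n = len(points)
    n_dims = len(points[0])
    # Level 1: dense intervals in each single dimension.
    counts = Counter(cell_of(p, (d,), xi) for p in points for d in range(n_dims))
    current = {u for u, c in counts.items() if c / n > tau}
    found = set(current)
    while current:
        # Join two dense k-dim units that agree on all but their last dimension.
        candidates = set()
        for a, b in combinations(sorted(current), 2):
            if a[:-1] == b[:-1] and a[-1][0] < b[-1][0]:
                cand = a + (b[-1],)
                # Prune: every k-dim projection of a dense unit must be dense.
                if all(s in current for s in combinations(cand, len(cand) - 1)):
                    candidates.add(cand)
        # One pass over the data to count points in each surviving candidate.
        counts = Counter()
        for p in points:
            for cand in candidates:
                dims = tuple(d for d, _ in cand)
                if cell_of(p, dims, xi) == cand:
                    counts[cand] += 1
        current = {u for u, c in counts.items() if c / n > tau}
        found |= current
    return found


# Hypothetical toy data: four points cluster in dimensions 0 and 1,
# while dimension 2 is spread out and carries no dense interval.
pts = [(0.1, 0.1, 0.05), (0.15, 0.2, 0.3), (0.2, 0.05, 0.55),
       (0.05, 0.15, 0.8), (0.6, 0.9, 0.1), (0.9, 0.6, 0.6)]
units = dense_units(pts, xi=4, tau=0.5)
```

On this toy input the search finds a dense unit in the two-dimensional subspace spanned by dimensions 0 and 1, and never generates candidates involving dimension 2, because no interval there is dense on its own.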
351 Citations
Locally adaptive metrics for clustering high dimensional data
- Mathematics, Computer Science
- Data Mining and Knowledge Discovery
- 2006
- 221
A weighting k-modes algorithm for subspace clustering of categorical data
- Mathematics, Computer Science
- Neurocomputing
- 2013
- 43
Mining Projected Clusters in High-Dimensional Spaces
- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2009
- 59
A Comprehensive Study of Challenges and Approaches for Clustering High Dimensional Data
- Computer Science
- 2014
- 2
High-Dimensional Clustering Method for High Performance Data Mining
- Computer Science
- International Conference on Computational Science
- 2007
References
Automatic subspace clustering of high dimensional data for data mining applications
- Computer Science
- SIGMOD '98
- 1998
- 2,588
Finding generalized projected clusters in high dimensional spaces
- Computer Science
- SIGMOD '00
- 2000
- 514
BIRCH: an efficient data clustering method for very large databases
- Computer Science
- SIGMOD '96
- 1996
- 4,579
- Highly Influential
CURE: an efficient clustering algorithm for large databases
- Computer Science
- SIGMOD '98
- 1998
- 2,769
- Highly Influential
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
- Computer Science
- KDD
- 1996
- 15,114
- Highly Influential
A Numerical Classification Method for Partitioning of a Large Multidimensional Mixed Data Set
- Mathematics
- 1979
- 5
Efficient and Effective Clustering Methods for Spatial Data Mining
- Computer Science
- VLDB
- 1994
- 2,018
- Highly Influential