A Unified Metric for Categorical and Numerical Attributes in Data Clustering
@inproceedings{Cheung2013AUM,
  title={A Unified Metric for Categorical and Numerical Attributes in Data Clustering},
  author={Yiu-ming Cheung and H. Jia},
  booktitle={PAKDD},
  year={2013}
}
Most of the existing clustering approaches are applicable to purely numerical or categorical data only, but not both. In general, it is a nontrivial task to perform clustering on mixed data composed of numerical and categorical attributes because there exists an awkward gap between the similarity metrics for categorical and numerical data. This paper therefore presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric which…
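The abstract is cut off before the metric is defined, so the following is only a minimal sketch of what an object-cluster similarity on mixed data could look like. It is not taken from the text above and should not be read as the paper's metric. Every name and formula here is an assumption made for illustration: the categorical part is the fraction of cluster members sharing the object's value, the numerical part is an exponential closeness to the cluster centroid normalised over all centroids, and the two are combined by treating the numerical vector as one extra attribute.

```python
import math
from collections import Counter


def categorical_similarity(obj_cat, cluster_cat):
    """Average, over categorical attributes, of the fraction of cluster
    members that share the object's value (illustrative assumption)."""
    if not obj_cat or not cluster_cat:
        return 0.0
    total = 0.0
    for r, value in enumerate(obj_cat):
        counts = Counter(member[r] for member in cluster_cat)
        total += counts[value] / len(cluster_cat)
    return total / len(obj_cat)


def numerical_similarity(obj_num, centroids, j):
    """Closeness of the object to centroid j, normalised over all
    centroids so the values sum to 1 (illustrative assumption)."""
    weights = [math.exp(-0.5 * math.dist(obj_num, c) ** 2) for c in centroids]
    return weights[j] / sum(weights)


def object_cluster_similarity(obj_cat, obj_num, clusters_cat, centroids, j):
    """Combine both parts into one score in [0, 1]; the numerical vector
    is weighted as a single extra attribute (hypothetical weighting)."""
    d_c = len(obj_cat)
    cat = categorical_similarity(obj_cat, clusters_cat[j])
    num = numerical_similarity(obj_num, centroids, j)
    return (d_c * cat + num) / (d_c + 1)


# Toy data: two clusters, two categorical attributes, one numerical attribute.
clusters_cat = [[("red", "s"), ("red", "m")], [("blue", "l")]]
centroids = [(0.0,), (10.0,)]
obj_cat, obj_num = ("red", "m"), (0.0,)

scores = [
    object_cluster_similarity(obj_cat, obj_num, clusters_cat, centroids, j)
    for j in range(len(centroids))
]
best_cluster = max(range(len(scores)), key=scores.__getitem__)
```

Because both parts are on the same [0, 1] scale, an object can be compared against every cluster with a single score and assigned to the highest-scoring one, which is the kind of bridge between categorical and numerical similarity the abstract describes.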
10 Citations
A New Distance Metric for Unsupervised Learning of Categorical Data
- Mathematics, Computer Science
- IEEE Transactions on Neural Networks and Learning Systems
- 2014
- 60
An entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood
- Computer Science
- Knowl. Based Syst.
- 2017
- 36
A novel density peaks clustering algorithm for mixed data
- Mathematics, Computer Science
- Pattern Recognit. Lett.
- 2017
- 27
Clustering algorithm for mixed datasets using density peaks and Self-Organizing Generative Adversarial Networks
- Computer Science
- 2020
Machine learning algorithm for clustering of heart disease and chemoinformatics datasets
- Computer Science
- Comput. Chem. Eng.
- 2020
A new distance metric for unsupervised learning of categorical data
- Mathematics, Computer Science
- IJCNN
- 2014
- 4
References
A k-mean clustering algorithm for mixed numeric and categorical data
- Computer Science
- Data Knowl. Eng.
- 2007
- 475
ROCK: a robust clustering algorithm for categorical attributes
- Computer Science
- Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337)
- 1999
- 1,522
Clustering Large Data Sets with Mixed Numeric and Categorical Values
- Computer Science
- 1997
- 451
- Highly Influential
A new initialization method for categorical data clustering
- Computer Science
- Expert Syst. Appl.
- 2009
- 113
On the Impact of Dissimilarity Measure in k-Modes Clustering Algorithm
- Computer Science, Medicine
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2007
- 170
Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values
- Mathematics, Computer Science
- Data Mining and Knowledge Discovery
- 2004
- 1,962
Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data
- Mathematics, Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2007
- 72
A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining
- Computer Science
- DMKD
- 1997
- 533