# CLUSTERING LARGE DATA SETS WITH MIXED NUMERIC AND CATEGORICAL VALUES

@inproceedings{Huang1997CLUSTERINGLD, title={CLUSTERING LARGE DATA SETS WITH MIXED NUMERIC AND CATEGORICAL VALUES}, author={Zhexue Huang}, year={1997} }

Efficient partitioning of large data sets into homogeneous clusters is a fundamental problem in data mining. [... ] In the algorithm, objects are clustered against k prototypes. A method is developed to dynamically update the k prototypes in order to maximise the intra-cluster similarity of objects. When applied to numeric data, the algorithm is identical to k-means. To assist interpretation of clusters, we use decision tree induction algorithms to create rules for clusters. These rules, together with…
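
The abstract does not spell out the dissimilarity measure or the update rule, so the following is only a minimal sketch of clustering against k prototypes under stated assumptions. It assumes a mixed dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches) and prototypes built from per-cluster means and modes. For simplicity it recomputes prototypes in a batch after each full pass, not after each object is allocated. The names `dissimilarity`, `update_prototype`, `k_prototypes`, and the parameter `gamma` are illustrative and not taken from the source.

```python
from collections import Counter


def dissimilarity(obj, proto, num_idx, cat_idx, gamma):
    """Assumed mixed measure: squared Euclidean distance on numeric
    attributes plus gamma times the count of categorical mismatches."""
    numeric = sum((obj[i] - proto[i]) ** 2 for i in num_idx)
    categorical = sum(obj[i] != proto[i] for i in cat_idx)
    return numeric + gamma * categorical


def update_prototype(members, num_idx, cat_idx, width):
    """Mean for each numeric attribute, most frequent value for each
    categorical attribute."""
    proto = [None] * width
    for i in num_idx:
        proto[i] = sum(m[i] for m in members) / len(members)
    for i in cat_idx:
        proto[i] = Counter(m[i] for m in members).most_common(1)[0][0]
    return proto


def k_prototypes(data, init_prototypes, num_idx, cat_idx, gamma=1.0, max_iter=100):
    """Batch variant: assign every object to its nearest prototype, then
    recompute all prototypes; stop when assignments no longer change."""
    prototypes = [list(p) for p in init_prototypes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(prototypes)),
                key=lambda c: dissimilarity(obj, prototypes[c], num_idx, cat_idx, gamma))
            for obj in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(prototypes)):
            members = [obj for obj, lab in zip(data, labels) if lab == c]
            if members:  # keep the old prototype if a cluster becomes empty
                prototypes[c] = update_prototype(members, num_idx, cat_idx, len(data[0]))
    return labels, prototypes


# Toy data: attribute 0 is numeric, attribute 1 is categorical.
data = [(1.0, "a"), (1.2, "a"), (8.0, "b"), (8.4, "b")]
labels, prototypes = k_prototypes(
    data, init_prototypes=[(1.0, "a"), (8.0, "b")], num_idx=[0], cat_idx=[1]
)
print(labels)
```

With `cat_idx=[]` the categorical term vanishes and the sketch reduces to a plain k-means loop, which mirrors the abstract's remark about purely numeric data.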

