A Unified Metric for Categorical and Numerical Attributes in Data Clustering

@inproceedings{Cheung2013AUM,
  title={A Unified Metric for Categorical and Numerical Attributes in Data Clustering},
  author={Yiu-ming Cheung and H. Jia},
  booktitle={PAKDD},
  year={2013}
}
  • Yiu-ming Cheung, H. Jia
  • Published in PAKDD 2013
  • Computer Science
  • Most of the existing clustering approaches are applicable to purely numerical or categorical data only, but not both. In general, it is a nontrivial task to perform clustering on mixed data composed of numerical and categorical attributes because there exists an awkward gap between the similarity metrics for categorical and numerical data. This paper therefore presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric which…
    10 Citations
    Using Categorical Attributes for Clustering
    A New Distance Metric for Unsupervised Learning of Categorical Data
    • 60
    An Efficient Technique for Clustering Data with Mixed Attribute Types
    A novel density peaks clustering algorithm for mixed data
    • 27
    A new distance metric for unsupervised learning of categorical data
    • 4
