The k-means algorithm is well known for its efficiency in clustering large data sets. However, working only on numeric values prohibits it from being used to cluster real-world data containing categorical values. In this paper we present two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical ...
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation because of its efficiency in clustering large data sets. However, working only on numeric values limits its use in data mining because data sets in data mining often contain categorical ...
This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced to the k-means clustering process to iteratively update variable weights based on the current partition of data, and a formula for weight calculation is proposed. The convergence theorem of the new clustering process is given. ...
This correspondence describes extensions to the fuzzy k-means algorithm for clustering categorical data. By using a simple matching dissimilarity measure for categorical objects and modes instead of means for clusters, a new approach is developed, which allows the use of the k-means paradigm to efficiently cluster large categorical data sets. A ...
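The two building blocks named in this abstract, a simple matching dissimilarity and cluster modes, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `matching_dissimilarity`, `cluster_mode`, and `assign` are hypothetical, objects are assumed to be equal-length tuples of categorical values, and the assignment step is a hard (non-fuzzy) one, so the fuzzy membership update described in the correspondence is not shown.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical objects differ."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(objects):
    """Return the most frequent value of each attribute in a non-empty cluster."""
    # zip(*objects) regroups the values attribute by attribute (column-wise).
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


def assign(objects, modes):
    """Hard assignment (an assumption here): index of the nearest mode per object."""
    return [
        min(range(len(modes)), key=lambda j: matching_dissimilarity(obj, modes[j]))
        for obj in objects
    ]
```

Alternating `assign` and `cluster_mode` until the assignments stop changing mirrors the usual k-means loop, with modes taking the place of means.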
This paper proposes two models for predicting the completion time of jobs in a service Grid. The single service model predicts the completion time of a job in a Grid that provides only one type of service. The multiple services model predicts the completion time of a job that runs in a Grid which offers multiple types of services. We have developed two ...
Community detection is an important issue in social network analysis. Most existing methods detect communities by analyzing the linkage of the network. The drawback is that each community identified by those methods reflects only the strength of connections, not semantics such as the topics of interest shared by its members. To ...