Nonparametric genetic clustering: comparison of validity indices

  • Sanghamitra Bandyopadhyay, Ujjwal Maulik
    IEEE Trans. Syst. Man Cybern. Syst.
A variable-string-length genetic algorithm (GA) is used to develop a novel nonparametric clustering technique for the case where the number of clusters is not fixed a priori. Chromosomes in the same population may have different lengths, since they encode different numbers of clusters. The crossover operator is redefined to handle variable string lengths. A cluster validity index is used as a measure of the fitness of a chromosome. The performance of several cluster validity indices…
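The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the chromosome layout (a list of cluster centers), the choice of the Davies-Bouldin index as the validity index, the scaling `1 / (1 + DB)` used for fitness, and the function names `davies_bouldin`, `fitness`, and `crossover` are all assumptions made for illustration. Scatter is measured around the encoded centers rather than recomputed centroids, which is a further simplification.

```python
import math
import random


def assign(points, centers):
    """Group points by nearest center; returns one list of points per center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        clusters[nearest].append(p)
    return clusters


def davies_bouldin(points, centers):
    """Davies-Bouldin index of the partition induced by `centers` (lower is better)."""
    clusters = assign(points, centers)
    # Ignore centers that attracted no points.
    kept = [(c, m) for c, m in zip(centers, clusters) if m]
    if len(kept) < 2:
        return math.inf  # the index is undefined for fewer than two clusters
    # Scatter: mean distance from each member to its encoded center.
    scatter = [sum(math.dist(p, c) for p in m) / len(m) for c, m in kept]
    total = 0.0
    for i, (ci, _) in enumerate(kept):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(ci, cj)
            for j, (cj, _) in enumerate(kept)
            if j != i
        )
    return total / len(kept)


def fitness(points, chromosome):
    """Assumed fitness: higher for a lower Davies-Bouldin index."""
    return 1.0 / (1.0 + davies_bouldin(points, chromosome))


def crossover(a, b, rng):
    """One-point crossover with an independent cut in each parent.

    Cuts fall on center boundaries, so the children may differ in length
    from their parents, and each child keeps at least two centers.
    """
    cut_a = rng.randint(1, len(a) - 1)
    cut_b = rng.randint(1, len(b) - 1)
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
two = [(0.3, 0.3), (10.3, 10.3)]                           # encodes 2 clusters
four = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]  # encodes 4 clusters

rng = random.Random(0)
child1, child2 = crossover(two, four, rng)
best = max([two, four, child1, child2], key=lambda ch: fitness(points, ch))
```

On this toy data the two-center chromosome scores higher than the four-center one, so the fitness alone steers the search toward a cluster count without that count being supplied in advance.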
An Efficient GA-based Clustering Technique
In this paper, we propose a GA-based unsupervised clustering technique that selects cluster centers directly from the data set, allowing it to speed up the fitness evaluation by constructing a
A Point Symmetry-Based Clustering Technique for Automatic Evolution of Clusters
A new symmetry-based genetic clustering algorithm is proposed which automatically evolves the number of clusters as well as the proper partitioning from a data set using a newly proposed PS-based cluster validity index, sym-index, as a measure of the validity of the corresponding partitioning.
Finding the optimal number of clusters using genetic algorithms
The AGCUK algorithm automatically provides the number of clusters and finds the clustering partition; the Davies-Bouldin index is employed to measure the validity of the clusters.
Genetic Algorithm-based Text Clustering Technique: Automatic Evolution of Clusters with High Efficiency
  • Wei Song, Soon-cheol Park
  • Computer Science
    2006 Seventh International Conference on Web-Age Information Management Workshops
  • 2006
The superiority of the MVGA over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper Reuter text collection clusters in terms of number of clusters and clustering data sets.
A Novel Clustering Approach using Hierarchical Genetic Algorithms
The hierarchical genetic algorithm (HGA) is employed to search automatically for the number of clusters and to locate the cluster centers properly; the Davies-Bouldin index is adopted as a measure of the validity of the clusters.
Genetic Algorithm-Based Text Clustering Technique
The superiority of the MVGA over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper text clustering.
A Comparison Study of Validity Indices on Swarm-Intelligence-Based Clustering
  • R. Xu, Jie Xu, D. Wunsch
  • Computer Science
    IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
  • 2012
This work compares the performances of eight well-known and widely used clustering validity indices and finds that the silhouette statistic index stands out in most of the data sets that are examined.
A weighted sum validity function for clustering with a hybrid niching genetic algorithm
An objective function called the Weighted Sum Validity Function (WSVF), a weighted sum of several normalized cluster validity functions, is suggested; it is generally able to improve the confidence of clustering solutions and achieve more accurate and robust results.
Hybridized Improved Genetic Algorithm with Variable Length Chromosome for Image Clustering
A Variable Length IGA is proposed that optimally finds the clusters of benchmark image datasets, and its performance is compared with K-means and GCUK [12].
An Improved Genetic Algorithm for Text Clustering
The superiority of the improved genetic algorithm over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper text clustering.


Messy Genetic Algorithms: Motivation, Analysis, and First Results
The mGA presented herein repeatedly achieves globally optimal results without prior knowledge of good string arrangements, and it does so at the very first generation in which strings are long enough to cover the problem.
Some new indexes of cluster validity
This work reviews two clustering algorithms and three indexes of crisp cluster validity and shows that while Dunn's original index has operational flaws, the concept it embodies provides a rich paradigm for validation of partitions that have cloud-like clusters.
An ISODATA clustering procedure for symbolic objects using a distributed genetic algorithm
On finding the number of clusters
Genetic Algorithms in Search Optimization and Machine Learning
This book brings together the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields.
Pattern Recognition Principles
The present work gives an account of basic principles and available techniques for the analysis and design of pattern processing and recognition systems. Areas covered include decision functions,
A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters
Two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space. In both cases, the extremizing
Algorithms for Clustering Data
A Cluster Separation Measure
A measure is presented which indicates the similarity of clusters which are assumed to have a data density which is a decreasing function of distance from a vector characteristic of the cluster. The