# Nonparametric genetic clustering: comparison of validity indices

@article{Bandyopadhyay2001NonparametricGC, title={Nonparametric genetic clustering: comparison of validity indices}, author={Sanghamitra Bandyopadhyay and Ujjwal Maulik}, journal={IEEE Trans. Syst. Man Cybern. Syst.}, year={2001}, volume={31}, pages={120-125} }

A variable-string-length genetic algorithm (GA) is used to develop a novel nonparametric clustering technique for cases where the number of clusters is not fixed a priori. Chromosomes in the same population may have different lengths, since they encode different numbers of clusters. The crossover operator is redefined to handle variable string lengths. A cluster validity index is used to measure the fitness of a chromosome. The performance of several cluster validity indices…
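The ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the chromosome layout (a plain list of cluster centers), the cut-and-splice `crossover`, and the `1 / (1 + DB)` fitness scaling are all assumptions made for illustration, and the Davies-Bouldin index stands in for whichever validity index is chosen as the fitness measure.

```python
import math
import random


def davies_bouldin(points, centers):
    """Davies-Bouldin index of the partition induced by `centers` (lower is better)."""
    # Assign every point to its nearest center.
    clusters = {}
    for p in points:
        k = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters.setdefault(k, []).append(p)
    if len(clusters) < 2:
        return math.inf  # the index is undefined for a single cluster
    # Centroid and mean distance to centroid (scatter) of each non-empty cluster.
    centroids, scatters = [], []
    for members in clusters.values():
        c = tuple(sum(axis) / len(members) for axis in zip(*members))
        centroids.append(c)
        scatters.append(sum(math.dist(p, c) for p in members) / len(members))
    n = len(centroids)
    total = 0.0
    for i in range(n):
        worst = 0.0
        for j in range(n):
            if i == j:
                continue
            d = math.dist(centroids[i], centroids[j])
            if d == 0:
                return math.inf  # coincident centroids: treat as an invalid partition
            worst = max(worst, (scatters[i] + scatters[j]) / d)
        total += worst
    return total / n


def fitness(points, chromosome):
    """Assumed fitness scaling: higher is better, 0.0 for invalid partitions."""
    return 1.0 / (1.0 + davies_bouldin(points, chromosome))


def crossover(parent_a, parent_b, rng):
    """Hypothetical variable-length crossover: cut each parent at its own point
    (always between two centers) and swap tails. Parents need >= 2 centers."""
    cut_a = rng.randint(1, len(parent_a) - 1)
    cut_b = rng.randint(1, len(parent_b) - 1)
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return child_1, child_2


# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
two_centers = [(0.3, 0.3), (15.0, 5.0)]
three_centers = [(0.3, 0.3), (10.3, 10.3), (20.3, 0.3)]
four_centers = three_centers + [(5.0, 5.0)]

# Chromosomes of different lengths can be recombined and scored the same way.
rng = random.Random(0)
child_1, child_2 = crossover(two_centers, four_centers, rng)
scores = {len(c): fitness(points, c) for c in (two_centers, three_centers)}
```

Because each parent is cut at its own position, the offspring generally differ in length from both parents, which is how a population can explore different numbers of clusters. In this sketch every child keeps at least two centers as long as both parents have at least two.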

## 270 Citations

An Efficient GA-based Clustering Technique

- Computer Science
- 2005

In this paper, we propose a GA-based unsupervised clustering technique that selects cluster centers directly from the data set, allowing it to speed up the fitness evaluation by constructing a…

A Point Symmetry-Based Clustering Technique for Automatic Evolution of Clusters

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2008

A new symmetry-based genetic clustering algorithm is proposed which automatically evolves the number of clusters as well as the proper partitioning from a data set using a newly proposed PS-based cluster validity index, sym-index, as a measure of the validity of the corresponding partitioning.

Finding the optimal number of clusters using genetic algorithms

- Computer Science
- 2008 IEEE Conference on Cybernetics and Intelligent Systems
- 2008

The AGCUK algorithm automatically determines the number of clusters and finds the clustering partition; the Davies-Bouldin index is employed to measure the validity of the clusters.

Genetic Algorithm-based Text Clustering Technique: Automatic Evolution of Clusters with High Efficiency

- Computer Science
- 2006 Seventh International Conference on Web-Age Information Management Workshops
- 2006

The superiority of the MVGA over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper Reuter text collection clusters in terms of number of clusters and clustering data sets.

A Novel Clustering Approach using Hierarchical Genetic Algorithms

- Computer Science
- Intell. Autom. Soft Comput.
- 2005

The hierarchical genetic algorithm (HGA) is employed to search automatically for the number of clusters and to locate the cluster centers properly; the Davies-Bouldin index is adopted as a measure of the validity of the clusters.

Genetic Algorithm-Based Text Clustering Technique

- Computer Science
- ICNC
- 2006

The superiority of the MVGA over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper text clustering.

A Comparison Study of Validity Indices on Swarm-Intelligence-Based Clustering

- Computer Science
- IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
- 2012

This work compares the performances of eight well-known and widely used clustering validity indices and finds that the silhouette statistic index stands out in most of the data sets that are examined.

A weighted sum validity function for clustering with a hybrid niching genetic algorithm

- Computer Science
- IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
- 2005

An objective function called the Weighted Sum Validity Function (WSVF), a weighted sum of several normalized cluster validity functions, is suggested; it is generally able to improve the confidence of clustering solutions and achieve more accurate and robust results.

Hybridized Improved Genetic Algorithm with Variable Length Chromosome for Image Clustering

- Computer Science
- 2007

A Variable Length IGA is proposed which optimally finds the clusters of benchmark image datasets and the performance is compared with K-means and GCUK[12].

An Improved Genetic Algorithm for Text Clustering

- Computer Science
- CIT 2014
- 2014

The superiority of the improved genetic algorithm over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper text clustering.

## References

Messy Genetic Algorithms: Motivation, Analysis, and First Results

- Computer Science
- Complex Syst.
- 1989

The mGA presented herein repeatedly achieves globally optimal results without prior knowledge of good string arrangements, and it does so at the very first generation in which strings are long enough to cover the problem.

Some new indexes of cluster validity

- Computer Science
- IEEE Trans. Syst. Man Cybern. Part B
- 1998

This work reviews two clustering algorithms and three indexes of crisp cluster validity and shows that while Dunn's original index has operational flaws, the concept it embodies provides a rich paradigm for validation of partitions that have cloud-like clusters.

An ISODATA clustering procedure for symbolic objects using a distributed genetic algorithm

- Computer Science
- Pattern Recognit. Lett.
- 1999

Genetic Algorithms in Search Optimization and Machine Learning

- Computer Science
- 1988

This book brings together the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields.

Pattern Recognition Principles

- Computer Science
- 1974

The present work gives an account of basic principles and available techniques for the analysis and design of pattern processing and recognition systems. Areas covered include decision functions,…

A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters

- Mathematics
- 1973

Two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space. In both cases, the extremizing…

A Cluster Separation Measure

- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 1979

A measure is presented which indicates the similarity of clusters which are assumed to have a data density which is a decreasing function of distance from a vector characteristic of the cluster. The…