Estimation of melting points of fatty acids using homogeneously hybridized support vector regression
Partitional clustering is an important research branch of data mining. It partitions a dataset into an arbitrary number k of clusters according to the correlation among the elements of the dataset. Most datasets have an original (underlying) number of clusters, which is estimated with a cluster validity index. However, most current cluster validity indices give erroneous estimates for most real datasets. To address this problem, this paper applies genetic algorithm optimization to a new adaptive cluster validity index, called the gene index (GI). The algorithm uses a genetic algorithm to adjust the weight values in the evaluation function of the adaptive cluster validity index, thereby training an optimal cluster validity index. The algorithm is tested on many real datasets, and the results show that, compared with current cluster validity indices, the proposed algorithm performs better and accurately estimates the original number of clusters in real datasets.
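The weight-tuning idea can be illustrated with a minimal sketch. It is not the GI formulation itself, which the abstract does not specify. It assumes, hypothetically, that the index is a weighted sum of per-k component values (here a separation term and a compactness term), that these values are already computed for each candidate k, and that fitness is the fraction of training datasets whose cluster number is estimated correctly. The names `estimate_k`, `fitness`, `evolve_weights`, the GA operators, and the toy numbers in `TRAINING` are all illustrative assumptions.

```python
import random

# Hypothetical toy training data: for each dataset, a map from candidate k to
# (separation, compactness) component values, paired with the known cluster number.
TRAINING = [
    ({2: (0.5, 0.9), 3: (0.9, 0.3), 4: (0.7, 0.25)}, 3),
    ({2: (0.8, 0.4), 3: (0.6, 0.35), 4: (0.5, 0.3)}, 2),
]

def estimate_k(weights, components):
    """Return the candidate k with the largest weighted sum of components."""
    return max(
        components,
        key=lambda k: sum(w * c for w, c in zip(weights, components[k])),
    )

def fitness(weights, training_sets):
    """Fraction of training datasets whose cluster number is estimated correctly."""
    hits = sum(estimate_k(weights, comps) == true_k for comps, true_k in training_sets)
    return hits / len(training_sets)

def evolve_weights(training_sets, n_weights, pop_size=30, generations=40, seed=0):
    """Evolve a weight vector with a simple genetic algorithm (assumed operators)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism).
        pop.sort(key=lambda w: fitness(w, training_sets), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation of a single randomly chosen weight.
            child[rng.randrange(n_weights)] += rng.gauss(0, 0.2)
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, training_sets))

best = evolve_weights(TRAINING, n_weights=2)
print("weights:", best, "fitness:", fitness(best, TRAINING))
```

In this sketch the fitness is measured on the same datasets used for training. A real evaluation would hold out separate datasets to check that the tuned weights generalize.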