# Recombinator-k-Means: An Evolutionary Algorithm That Exploits k-Means++ for Recombination

```bibtex
@article{Baldassi2019RecombinatorkMeansAE,
  title   = {Recombinator-k-Means: An Evolutionary Algorithm That Exploits k-Means++ for Recombination},
  author  = {Carlo Baldassi},
  journal = {IEEE Transactions on Evolutionary Computation},
  year    = {2019},
  volume  = {26},
  pages   = {991-1003}
}
```
• Carlo Baldassi
• Published 1 May 2019
• Computer Science
• IEEE Transactions on Evolutionary Computation
We introduce an evolutionary algorithm called recombinator-$k$-means for optimizing the highly nonconvex $k$-means problem. Its defining feature is that its crossover step involves all the members of the current generation, stochastically recombining them with a repurposed variant of the $k$-means++ seeding algorithm. The recombination also uses a…
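The crossover idea summarized in the abstract can be sketched as follows: pool the centroids of every configuration in the current generation, then draw $k$ new centers with a $k$-means++-style rule. This is a minimal illustration only — the per-parent quality weights (`weights`) and the exact sampling rule are assumptions, not the paper's actual reweighting scheme, which the truncated abstract does not fully specify.

```python
import random

def recombine(generation, weights, k, rng=random):
    """Sketch of a recombination step in the spirit of recombinator-k-means.

    `generation` is a list of configurations (each a list of centroid tuples);
    `weights` is a hypothetical per-configuration quality weight.
    """
    # Pool every centroid, tagged with its parent configuration's weight.
    pool = [(c, w) for cfg, w in zip(generation, weights) for c in cfg]
    cands = [c for c, _ in pool]
    # First center: sampled by parent weight alone.
    centers = [rng.choices(cands, weights=[w for _, w in pool])[0]]
    while len(centers) < k:
        probs = []
        for c, w in pool:
            # Squared distance to the nearest already-chosen center,
            # scaled by the parent's weight (k-means++-style sampling).
            d2 = min(sum((a - b) ** 2 for a, b in zip(c, ctr)) for ctr in centers)
            probs.append(w * d2)
        if sum(probs) == 0:  # degenerate pool: fall back to uniform sampling
            probs = [1.0] * len(pool)
        centers.append(rng.choices(cands, weights=probs)[0])
    return centers
```

Because every member of the generation contributes centroids to the pool, this is a whole-population (multiparent) crossover rather than a two-parent one.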

## 5 Citations

A meta-method for initializing (seeding) the k-means clustering algorithm called PNN-smoothing, which consists in splitting a given dataset into J random subsets, clustering each of them individually, and merging the resulting clusterings with the pairwise-nearest-neighbor (PNN) method.
• Chuan Wu
• Computer Science
2022 IEEE 2nd International Conference on Data Science and Computer Application (ICDSCA)
• 2022
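The PNN-smoothing pipeline described in the entry above can be sketched under some stated simplifications: the base clusterer here is plain Lloyd k-means from a random seeding, and the PNN merge uses unweighted midpoints (the actual PNN method weights merges by cluster size). All function names are illustrative.

```python
import random

def lloyd(points, k, iters=10, rng=random):
    """Plain Lloyd k-means from a random seeding (stand-in base clusterer)."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in points:
            # Assign each point to its nearest current center.
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])))
            groups[j].append(x)
        # Recompute each center as the mean of its group (keep it if empty).
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def pnn_merge(centers, k):
    """Pairwise-nearest-neighbor merge: repeatedly replace the closest pair
    of centers by their midpoint until only k remain (unweighted variant)."""
    centers = list(centers)
    while len(centers) > k:
        _, i, j = min((sum((a - b) ** 2 for a, b in zip(centers[i], centers[j])), i, j)
                      for i in range(len(centers)) for j in range(i + 1, len(centers)))
        merged = tuple((a + b) / 2 for a, b in zip(centers[i], centers[j]))
        centers = [c for t, c in enumerate(centers) if t not in (i, j)] + [merged]
    return centers

def pnn_smoothing(points, k, J=3, rng=random):
    """Split into J random subsets, cluster each, merge the pooled centers."""
    shuffled = list(points)
    rng.shuffle(shuffled)
    subsets = [shuffled[i::J] for i in range(J)]
    pooled = [c for s in subsets for c in lloyd(s, k, rng=rng)]
    return pnn_merge(pooled, k)
```

The merging step is what "smooths" the seeding: each subset's clustering is cheap and noisy, but merging J independent solutions down to k centers tends to cancel their individual mistakes.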
This paper proposes an improvement strategy for clustering based on a density-parameter optimization algorithm, which optimizes the clustering method and cluster characteristics so that the algorithm can be better applied to the data-analysis process.
• Computer Science
2022 4th Blockchain and Internet of Things Conference
• 2022
An optimization method is proposed for the deployment of edge-computing terminals in the Electric Internet of Things that considers node division and aims to minimize task-processing delay.
• Computer Science
Journal of Big Data
• 2022
Experimental results of both adaptive and non-adaptive state-of-the-art methods on industrial HDI datasets illustrate that ADMA achieves a desirable global optimum with reasonable overhead and prevails over competing methods in terms of predicting the missing data in HDI matrices.

## References

SHOWING 1-10 OF 35 REFERENCES

• Computer Science
SODA '07
• 2007
By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(log k)-competitive with the optimal clustering.
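The seeding technique described in this reference (k-means++) can be sketched in a few lines: the first center is chosen uniformly at random, and each subsequent center is chosen with probability proportional to its squared distance from the nearest center picked so far.

```python
import random

def kmeans_pp_seed(points, k, rng=random):
    """Minimal k-means++ seeding over a list of point tuples."""
    # First center: uniform over the data.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its closest chosen center;
        # points already chosen get weight 0 and cannot be re-sampled.
        d2 = [min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers)
              for x in points]
        centers.append(rng.choices(points, weights=d2)[0])
    return centers
```

This D²-weighted sampling is also the mechanism that recombinator-k-means repurposes for its crossover step, applied to pooled centroids instead of raw data points.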

### How much can k-means be improved by using better initialization and repeats?

• Pattern Recognition
• 2019
The main results are that the expected time complexity of the random swap algorithm has (1) a linear dependency on the number of data vectors, (2) a quadratic dependency on the number of clusters, and (3) an inverse dependency on the size of the neighborhood.
A survey is given of the multiparent operators that have been introduced over the years in evolutionary computing, and the traditional mutation-or-crossover debate is reformulated in the light of such operators.
• Computer Science
1998 IEEE International Conference on Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98TH8360)
• 1998
The results showed clearly that multi-parent recombination leads to better performance, although the performance improvement of the different techniques was found to be problem-dependent.
• Computer Science
Applied Intelligence
• 2018
The results show that overlap is critical, and that k-means starts to work effectively when the overlap reaches the 4% level.
• Computer Science
NIPS
• 2017
A sparse embedded $k$-means clustering algorithm which requires $\mathcal{O}(\mathrm{nnz}(X))$ time (where $\mathrm{nnz}(X)$ denotes the number of non-zeros in $X$) for fast matrix multiplication, and improves on [1]'s results for approximation accuracy by a factor of one.
• Computer Science
Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)
• 1999
A new method is presented for reducing the number of distance calculations in the generalized Lloyd algorithm (GLA), a widely used method for constructing a codebook in vector quantization; it detects the activity of the code vectors and exploits it in the classification of the training vectors.