# Convergence Properties of the K-Means Algorithms

@inproceedings{Bottou1994ConvergencePO, title={Convergence Properties of the K-Means Algorithms}, author={L{\'e}on Bottou and Yoshua Bengio}, booktitle={NIPS}, year={1994} }

This paper studies the convergence properties of the well-known K-Means clustering algorithm. The K-Means algorithm can be described either as a gradient descent algorithm or by slightly extending the mathematics of the EM algorithm to this hard-threshold case. We show that the K-Means algorithm actually minimizes the quantization error using the very fast Newton algorithm.
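The gradient-descent view in the abstract can be sketched concretely. In online (MacQueen-style) K-Means, each point nudges its nearest center with a per-center learning rate of 1/n_k; because the Hessian of the quantization error is diagonal with the cluster counts on its diagonal, this particular rate makes the gradient step coincide with a Newton step. The sketch below is an illustration of that update, not code from the paper:

```python
import numpy as np

def online_kmeans(X, k, seed=0):
    """Online k-means: each point moves its nearest center with a
    per-center learning rate 1/n_k, so each center tracks the running
    mean of the points assigned to it (the Newton-step interpretation)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    counts = np.zeros(k)
    for x in X:
        j = np.argmin(((centers - x) ** 2).sum(axis=1))  # nearest center
        counts[j] += 1
        centers[j] += (x - centers[j]) / counts[j]  # 1/n_k Newton-like step
    return centers
```

With any other learning-rate schedule the same loop is plain stochastic gradient descent on the quantization error; the 1/n_k schedule is what the paper identifies as the Newton step.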

## 477 Citations

### k-Means Clustering via the Frank-Wolfe Algorithm

- Computer Science, LWDA
- 2016

It is shown that k-means clustering is a matrix factorization problem and how the constrained optimization steps involved in this procedure can be solved efficiently using the Frank-Wolfe algorithm.

### Convergence of online k-means

- Computer Science, Mathematics, AISTATS
- 2022

Convergence is proved by extending techniques from the optimization literature to handle settings where center-specific learning rates may depend on the past trajectory of the centers.

### Selection of K in K-means clustering

- Computer Science
- 2005

Existing methods for selecting the number of clusters for the K-means algorithm are reviewed and a new measure to assist the selection is proposed.
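The entry above does not spell out its proposed measure, but the most common baseline it reviews is the "elbow" heuristic: plot the within-cluster sum of squares against k and look for the point of diminishing returns. The sketch below implements that baseline (not the paper's new measure) with a few Lloyd iterations per candidate k:

```python
import numpy as np

def inertia_curve(X, ks, n_iters=20, seed=0):
    """Within-cluster sum of squares for each candidate k, computed with
    a few Lloyd iterations. The 'elbow' of this curve is a common
    heuristic for choosing k (a baseline, not the paper's measure)."""
    rng = np.random.default_rng(seed)
    out = []
    for k in ks:
        centers = X[rng.choice(len(X), k, replace=False)].astype(float)
        for _ in range(n_iters):
            d = ((X[:, None] - centers[None]) ** 2).sum(-1)  # (N, k)
            assign = d.argmin(axis=1)
            for j in range(k):
                pts = X[assign == j]
                if len(pts):  # skip empty clusters
                    centers[j] = pts.mean(axis=0)
        d = ((X[:, None] - centers[None]) ** 2).sum(-1)
        out.append(d.min(axis=1).sum())  # quantization error at this k
    return out
```

On data with c well-separated clusters, the curve drops sharply up to k = c and flattens afterwards, which is the signal the heuristic exploits.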

### A Local Search Approach to K-Clustering

- Computer Science
- 1999

This paper analytically derives a clustering algorithm based on Local Search, and proves that A-LKM, as applied to the problem of clustering subclusters, preserves the monotone convergence property.

### Self-Adaptive k-Means Based on a Covering Algorithm

- Computer Science, Complex.
- 2018

Extensive experiments on real data sets show that the C-K-means algorithm outperforms existing algorithms in both accuracy and efficiency, under both sequential and parallel conditions.

### On the Lower Bound of Local Optimums in K-Means Algorithm

- Computer Science, Sixth International Conference on Data Mining (ICDM'06)
- 2006

This paper proposes an efficient method to compute a lower bound on the cost of the local optimum from the current center set, and shows that it can prune unnecessary iterations and improve the efficiency of the algorithm on most data sets, especially those with high dimensionality and large k.

### A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

- Computer Science, Expert Syst. Appl.
- 2013

### How the initialization affects the stability of the k-means algorithm

- Computer Science
- 2009

This paper investigates the role of initialization in the stability of the k-means clustering algorithm, analyzing when different initializations lead to the same local optimum and when they lead to different local optima.

### Web-scale k-means clustering

- Computer Science, WWW '10
- 2010

This work proposes the use of mini-batch optimization for k-means clustering, which reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent.
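The mini-batch variant summarized above keeps the same per-center 1/n_k step as online k-means but applies it to small random batches, trading a little gradient noise for much cheaper iterations. A minimal sketch in the style of that work (an illustration, not the authors' implementation):

```python
import numpy as np

def minibatch_kmeans(X, k, batch_size=32, n_iters=200, seed=0):
    """Mini-batch k-means: each iteration assigns one small random batch
    to the nearest centers, then nudges each center toward its batch
    points with a per-center 1/count learning rate."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    counts = np.zeros(k)
    for _ in range(n_iters):
        batch = X[rng.choice(len(X), batch_size, replace=False)]
        # nearest-center assignment for the whole batch at once
        d = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        for x, j in zip(batch, assign):
            counts[j] += 1
            centers[j] += (x - centers[j]) / counts[j]  # decaying step
    return centers
```

Because the per-center counts keep growing across batches, the effective learning rate decays automatically, which is what stabilizes the centers without a hand-tuned schedule.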

### The analysis of a simple k-means clustering algorithm

- Computer Science, SCG '00
- 2000

This paper presents a simple and efficient implementation of Lloyd's k-means clustering algorithm, which differs from most other approaches in that it precomputes a kd-tree data structure for the data points rather than the center points.

## References


### Note on Learning Rate Schedules for Stochastic Optimization

- Computer Science, NIPS
- 1990

Proposes "search-then-converge" learning rate schedules, which outperform the classical constant and "running average" (1/t) schedules in both speed of convergence and quality of solution.

### Some methods for classification and analysis of multivariate observations

- Mathematics
- 1967

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give…

### Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures

- Computer Science
- 1991

An unsupervised algorithm which is an alternative to the classical winner-take-all competitive algorithms and a supervised modular architecture in which a number of simple "expert" networks compete to solve distinct pieces of a large task are considered.

### Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper

- Mathematics
- 1977

### Self-Organization and Associative Memory

- Computer Science
- 1988

The purpose and nature of biological memory, and some of its key aspects, are explained.

### Optimisation par descente de gradient stochastique de systemes modulaires combinant reseaux de neurones et programmation dynamique. Application a la reconnaissance de la parole

- Computer Science
- 1994

This thesis studies modular systems combining neural networks (MLP) and dynamic programming (DP), and their application to speech recognition. It is…

### PTAH on continuous multivariate functions of Markov chains

- Technical Report 80193
- 1976