• Corpus ID: 1672253

Convergence Properties of the K-Means Algorithms

Léon Bottou, Yoshua Bengio
This paper studies the convergence properties of the well known K-Means clustering algorithm. The K-Means algorithm can be described either as a gradient descent algorithm or by slightly extending the mathematics of the EM algorithm to this hard threshold case. We show that the K-Means algorithm actually minimizes the quantization error using the very fast Newton algorithm. 
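The Newton-step claim in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the quantization error is written as half the sum of squared distances from each point to its nearest center, and the function names (`assign`, `newton_step`, `lloyd_step`) are illustrative. With assignments held fixed, the gradient for center k is the sum of (w_k - x_i) over its members and the Hessian is N_k times the identity, so a single Newton step lands on the cluster mean, which is the usual batch K-Means update.

```python
import numpy as np

def assign(points, centers):
    # Index of the nearest center for every point (squared Euclidean distance).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def quantization_error(points, centers):
    # Assumed convention: E = 1/2 * sum_i min_k ||x_i - w_k||^2.
    labels = assign(points, centers)
    return 0.5 * ((points - centers[labels]) ** 2).sum()

def newton_step(points, centers):
    # One Newton step on E with the current assignments held fixed.
    labels = assign(points, centers)
    new_centers = centers.copy()
    for k in range(len(centers)):
        members = points[labels == k]
        if len(members) == 0:
            continue  # empty cluster: gradient and Hessian vanish, keep the center
        grad = (centers[k] - members).sum(axis=0)  # dE/dw_k
        hess = len(members)                        # Hessian is N_k * identity
        new_centers[k] = centers[k] - grad / hess
    return new_centers

def lloyd_step(points, centers):
    # Classic batch K-Means update: move each center to the mean of its members.
    labels = assign(points, centers)
    new_centers = centers.copy()
    for k in range(len(centers)):
        members = points[labels == k]
        if len(members) > 0:
            new_centers[k] = members.mean(axis=0)
    return new_centers

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
centers = points[:3].copy()
print(np.allclose(newton_step(points, centers), lloyd_step(points, centers)))
```

The two updates agree up to floating-point rounding, and repeating either one never increases the quantization error.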


k-Means Clustering via the Frank-Wolfe Algorithm

It is shown that k-means clustering is a matrix factorization problem and how the constrained optimization steps involved in this procedure can be solved efficiently using the Frank-Wolfe algorithm.

Convergence of online k-means

Convergence is proved by extending techniques used in optimization literature to handle settings where center-specific learning rates may depend on the past trajectory of the centers.

Selection of K in K-means clustering

Existing methods for selecting the number of clusters for the K-means algorithm are reviewed and a new measure to assist the selection is proposed.

A Local Search Approach to K-Clustering

This paper analytically derives a clustering algorithm based on a local search algorithm and proves that A-LKM, as applied to the problem of clustering subclusters, preserves the monotone convergence property.

Self-Adaptive k-Means Based on a Covering Algorithm

Extensive experiments on real data sets show that the C-K-means algorithm outperforms existing algorithms in both accuracy and efficiency under sequential and parallel conditions.

On the Lower Bound of Local Optimums in K-Means Algorithm

This paper proposes an efficient method to compute a lower bound on the cost of the local optimum from the current center set and shows that this method can greatly prune the unnecessary iterations and improve the efficiency of the algorithm in most of the data sets, especially with high dimensionality and large k.

How the initialization affects the stability of the $k$-means algorithm

This paper investigates the role of the initialization for the stability of the k-means clustering algorithm and analyzes when different initializations lead to the same local optimum, and when they lead to different local optima.

Web-scale k-means clustering

This work proposes the use of mini-batch optimization for k-means clustering, which reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent.

The analysis of a simple k-means clustering algorithm

This paper presents a simple and efficient implementation of Lloyd's k-means clustering algorithm, which differs from most other approaches in that it precomputes a kd-tree data structure for the data points rather than the center points.



Note on Learning Rate Schedules for Stochastic Optimization

"search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.

Some methods for classification and analysis of multivariate observations

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give

Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures

An unsupervised algorithm which is an alternative to the classical winner-take-all competitive algorithms and a supervised modular architecture in which a number of simple "expert" networks compete to solve distinct pieces of a large task are considered.

Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper


Self-Organization and Associative Memory

The purpose and nature of biological memory, as well as some aspects of memory, are explained.

Optimisation par descente de gradient stochastique de systemes modulaires combinant reseaux de neurones et programmation dynamique. Application a la reconnaissance de la parole

This thesis is devoted to the study of modular systems combining neural networks (MLP) and dynamic programming (DP), and to their application to speech recognition. It is

Artificial neural networks and their application to sequence recognition

PTAH on continuous multivariate functions of Markov chains

  • Technical Report 80193,
  • 1976