# k-means Requires Exponentially Many Iterations Even in the Plane

```bibtex
@article{Vattani2011kmeansRE,
  title   = {k-means Requires Exponentially Many Iterations Even in the Plane},
  author  = {Andrea Vattani},
  journal = {Discrete \& Computational Geometry},
  year    = {2011},
  volume  = {45},
  pages   = {596--616}
}
```

The k-means algorithm is a well-known method for partitioning $n$ points lying in $d$-dimensional space into $k$ clusters. Its main features are simplicity and speed in practice. Theoretically, however, the best known upper bound on its running time (i.e., $n^{O(kd)}$) is, in general, exponential in the number of points (when $kd = \Omega(n/\log n)$). Recently Arthur and Vassilvitskii (Proceedings of the 22nd Annual Symposium on Computational Geometry, pp. 144–153, 2006) showed a super-polynomial worst-case…
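For reference, the iteration whose worst-case count the paper bounds is Lloyd's alternation of assignment and centroid-update steps. Below is a minimal, illustrative Python sketch for points in the plane (plain lists of `(x, y)` tuples, random initialization); it is not the paper's lower-bound construction, only the algorithm it analyzes.

```python
import random

def kmeans(points, k, max_iters=100):
    """Lloyd's k-means on 2-D points: alternate assignment and update steps.

    points: list of (x, y) tuples; k: number of clusters.
    Returns the final centers and the clusters assigned to them.
    """
    centers = random.sample(points, k)
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest current center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: (p[0] - centers[j][0]) ** 2
                                + (p[1] - centers[j][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # centers stable: assignments cannot change
            break
        centers = new_centers
    return centers, clusters
```

The paper's result says that, even for such 2-D inputs, the number of assignment/update rounds before convergence can be $2^{\Omega(n)}$ in the worst case, which is why the `max_iters` cap is standard in practice.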

## 115 Citations

k-means requires exponentially many iterations even in the plane

- Computer Science, Mathematics
- SCG '09
- 2009

This work proves the existence of super-polynomial lower bounds for any $d \ge 2$ and improves the lower bound by presenting a simple construction in the plane that leads to the exponential lower bound $2^{\Omega(n)}$.

Exact algorithms for size constrained 2-clustering in the plane

- Computer Science, Mathematics
- Theor. Comput. Sci.
- 2016

An approximation algorithm for the uniform capacitated k-means problem

- Computer Science, Mathematics
- 2020

Based on the technique of local search, a bi-criteria approximation algorithm is presented, which has a constant approximation guarantee and violates the cardinality constraint within a constant factor, for the UC-k-means.

Clustering Perturbation Resilient Instances

- Computer Science
- ArXiv
- 2018

This work considers stable instances of Euclidean $k$-means that admit provable polynomial-time algorithms for recovering the optimal clustering, and proposes simple algorithms, with running time linear in the number of points and the dimension, that provably recover the optimal clustering.

The seeding algorithm for spherical k-means clustering with penalties

- Computer Science
- 2020

It is proved that, when the seeding algorithm is applied to spherical k-means clustering with penalties on separable instances, it achieves an approximation ratio of $2\max\{3, M+1\}$ with high probability, where $M$ is the ratio of the maximal to the minimal penalty cost of the given data set.

A distance saving approach to the K-means problem for massive data

- Computer Science
- 2016

Experimental results indicate that the proposed approximation to the solution of the K-means problem outperforms well-known approaches in terms of the trade-off between the number of computations and the quality of the approximation.

On the minimum of the mean-squared error in 2-means clustering

- Mathematics
- Involve, a Journal of Mathematics
- 2019

We study the minimum mean-squared error for 2-means clustering when the outcomes of the vector-valued random variable to be clustered are on two touching spheres of unit radius in $n$-dimensional…

Sketching and Clustering Metric Measure Spaces

- Computer Science, Mathematics
- ArXiv
- 2018

A duality between general classes of clustering and sketching problems is demonstrated, and it is proved that whereas the gap between these can be arbitrarily large, in the case of doubling metric spaces the resulting sketching objectives are polynomially related.

Analysis of Ward's Method

- Computer Science
- SODA
- 2019

It is shown that Ward's method computes a $2$-approximation with respect to the $k$-means objective function if the optimal $k$-clustering is well separated, and that Ward's method produces an $\mathcal{O}(1)$-approximate clustering for one-dimensional data sets.

Scalable K-Means++

- Computer Science
- Proc. VLDB Endow.
- 2012

It is proved that the proposed initialization algorithm, k-means||, obtains a nearly optimal solution after a logarithmic number of passes, and experimental evaluation on real-world large-scale data demonstrates that k-means|| outperforms k-means++ in both sequential and parallel settings.
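The sequential baseline that k-means|| parallelizes is the standard k-means++ $D^2$ seeding: each new center is drawn with probability proportional to its squared distance from the nearest center already chosen. A minimal sketch of that sequential seeding (not the paper's parallel oversampling variant) for 2-D points:

```python
import random

def kmeanspp_seed(points, k, rng=None):
    """k-means++ D^2 seeding: draw each new center with probability
    proportional to its squared distance to the closest chosen center."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers)
              for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with some chosen center
            centers.append(rng.choice(points))
            continue
        r = rng.random() * total  # r lies in [0, total)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc > r:  # cumulative weight passes r: pick this point
                centers.append(p)
                break
    return centers
```

k-means|| replaces this inherently sequential loop (k passes, one per center) with a few rounds that each oversample many candidate centers in parallel, then reclusters the candidates down to k.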