Scalability for Clustering Algorithms Revisited

Fredrik Farnstrom, James Lewis, Charles Elkan
SIGKDD Explorations
This paper presents a simple new algorithm that performs k-means clustering in one scan of a dataset, while using a fixed-size buffer for points from the dataset. Experiments show that the new method is several times faster than standard k-means, and that it produces clusterings of equal or almost equal quality. The new method is a simplification of an algorithm due to Bradley, Fayyad and Reina that uses several data compression techniques in an attempt to improve speed and clustering…
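
The one-scan, fixed-buffer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The names `single_scan_kmeans` and `nearest`, the parameters `buffer_size`, `iters` and `seed`, the choice of initial centers (sampled from the first buffer), and the fixed number of iterations per buffer are all assumptions made for illustration. The sketch also assumes that points already discarded from the buffer are kept only as per-cluster counts and coordinate sums, and that each such summary stays with its own cluster.

```python
from itertools import islice
import random


def nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
    )


def single_scan_kmeans(points, k, buffer_size, iters=10, seed=0):
    """Hypothetical one-scan k-means over an iterable of equal-length points.

    Only `buffer_size` raw points are held in memory at a time; everything
    read earlier survives only as per-cluster (count, sum) summaries.
    """
    if iters < 1:
        raise ValueError("iters must be at least 1")
    rng = random.Random(seed)
    stream = iter(points)
    centers = None
    counts = [0] * k   # points already compressed into each cluster
    sums = None        # coordinate sums of those compressed points

    while True:
        buffer = list(islice(stream, buffer_size))  # refill the buffer
        if not buffer:
            break
        if centers is None:
            # Assumption: seed the centers from the first buffer.
            centers = [list(p) for p in rng.sample(buffer, k)]
            sums = [[0.0] * len(buffer[0]) for _ in range(k)]

        for _ in range(iters):
            # Start each pass from the compressed summaries, then add
            # the buffered points assigned to their nearest center.
            new_counts = list(counts)
            new_sums = [list(s) for s in sums]
            for p in buffer:
                j = nearest(p, centers)
                new_counts[j] += 1
                for d, x in enumerate(p):
                    new_sums[j][d] += x
            for j in range(k):
                if new_counts[j] > 0:
                    centers[j] = [s / new_counts[j] for s in new_sums[j]]

        # Compress: fold the buffer into the summaries and discard it.
        counts, sums = new_counts, new_sums

    return centers, counts


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (10, 11),
            (1, 0), (11, 10), (1, 1), (11, 11)]
    print(single_scan_kmeans(data, k=2, buffer_size=4))
```

Each point is read from the input exactly once, and memory use is bounded by the buffer plus one count and one sum vector per cluster, which is the property the abstract emphasizes.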