k-Means for Streaming and Distributed Big Sparse Data

Abstract

We provide the first streaming algorithm for computing a provable approximation to the k-means of sparse Big Data. Here, sparse Big Data is a set of n vectors in R^d, where each vector has O(1) nonzero entries, and d ≥ n. Examples include the adjacency matrix of a graph and web-link, social-network, document-term, or image-feature matrices. Our streaming algorithm stores…
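
To make the data model above concrete, the following is a minimal sketch of how sparse vectors and the k-means cost could be represented. It is not the paper's streaming algorithm. The `SparseVec` type and the `sq_dist` and `kmeans_cost` helpers are hypothetical names introduced here for illustration, and storing centers sparsely is an assumption made to keep the example small.

```python
from typing import Dict, List

# Hypothetical representation: coordinate index -> nonzero value.
# A vector in R^d with O(1) nonzero entries needs O(1) memory, regardless of d.
SparseVec = Dict[int, float]


def sq_dist(p: SparseVec, c: SparseVec) -> float:
    """Squared Euclidean distance, visiting only coordinates nonzero in p or c."""
    keys = p.keys() | c.keys()
    return sum((p.get(i, 0.0) - c.get(i, 0.0)) ** 2 for i in keys)


def kmeans_cost(points: List[SparseVec], centers: List[SparseVec]) -> float:
    """k-means objective: sum over points of the squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Three points in a space with d = 1,000,000 coordinates, each with at most two nonzeros.
points = [{0: 1.0}, {0: 1.0, 5: 2.0}, {999_999: 3.0}]
# Two illustrative centers (assumed sparse here; k-means centers need not be sparse in general).
centers = [{0: 1.0, 5: 1.0}, {999_999: 3.0}]

cost = kmeans_cost(points, centers)
```

The cost is evaluated without ever materializing a length-d array, which is why the O(1)-nonzeros assumption matters when d ≥ n.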
DOI: 10.1137/1.9781611974348.39
