A new k-means clustering algorithm for data streams of points from a Euclidean space that provides a good alternative to BIRCH and StreamLS, in particular if the number of cluster centers is large.

The authors' coresets with the merge-and-reduce approach obtain embarrassingly parallel streaming algorithms for problems such as k-means, PCA and projective clustering, and a simple recursive coreset construction that produces coresets of size.
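
The merge-and-reduce idea mentioned in this summary can be illustrated with a minimal sketch. It is not the authors' construction: `reduce_bucket` is a hypothetical placeholder that takes a uniform subsample and rescales weights, where a real implementation would call an actual coreset routine. The class name `MergeAndReduce` and the parameter `bucket_size` are likewise assumptions made for illustration. The sketch only shows the bookkeeping: full buckets are combined like carries in a binary counter, so a stream of n points is summarized by about log(n / bucket_size) buckets.

```python
import random


def reduce_bucket(points, size, rng):
    """Hypothetical stand-in for a coreset construction.

    Takes a list of (value, weight) pairs, keeps a uniform subsample of
    `size` pairs, and rescales the weights so the total weight is preserved.
    """
    if len(points) <= size:
        return list(points)
    total = sum(w for _, w in points)
    sample = rng.sample(points, size)
    scale = total / sum(w for _, w in sample)
    return [(x, w * scale) for x, w in sample]


class MergeAndReduce:
    """Streaming summary kept as one bucket per level of a binary counter."""

    def __init__(self, bucket_size, seed=0):
        self.m = bucket_size
        self.rng = random.Random(seed)
        self.buffer = []   # raw points that do not yet fill a bucket
        self.levels = []   # levels[i] is None or a summary of 2**i * m points

    def insert(self, x):
        self.buffer.append((x, 1.0))
        if len(self.buffer) < self.m:
            return
        carry, self.buffer = self.buffer, []
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append(None)
            if self.levels[level] is None:
                self.levels[level] = carry
                return
            # Two summaries on the same level: merge, reduce, carry upward.
            merged = self.levels[level] + carry
            carry = reduce_bucket(merged, self.m, self.rng)
            self.levels[level] = None
            level += 1

    def summary(self):
        """Union of the buffer and all occupied levels."""
        out = list(self.buffer)
        for bucket in self.levels:
            if bucket is not None:
                out.extend(bucket)
        return out
```

Because every bucket is built independently of the others on its level, separate parts of a stream can be summarized on different machines and merged afterwards, which is the sense in which such schemes parallelize easily.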

Two space-bounded random sampling algorithms are presented that approximate the number of triangles in an undirected graph given as a stream of edges; they provide a basic tool for analyzing the structure of large graphs.
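
To make the idea of estimating triangle counts by sampling concrete, here is a minimal sketch of a simpler, generic technique: keep each streamed edge independently with probability `p`, count triangles among the kept edges, and scale by `1 / p**3`. This is not either of the two algorithms from the paper, and unlike them it does not enforce a fixed space bound. The function name `estimate_triangles` and its parameters are assumptions chosen for illustration.

```python
import random


def estimate_triangles(edge_stream, p, seed=0):
    """Edge-sampling estimate of the number of triangles in a simple graph.

    Assumes each undirected edge appears once in the stream and that vertex
    labels are mutually comparable. A triangle survives sampling with
    probability p**3, so the sampled count divided by p**3 is unbiased.
    """
    rng = random.Random(seed)
    adj = {}
    for u, v in edge_stream:
        if u == v:
            continue  # ignore self-loops
        if rng.random() < p:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    sampled = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:
                # Count each triangle once, in the order u < v < w.
                sampled += sum(1 for w in nbrs & adj[v] if w > v)
    return sampled / p ** 3
```

With `p = 1.0` every edge is kept and the function returns the exact count, which is a convenient sanity check; smaller values of `p` trade accuracy for memory.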

Every unweighted point set P has a weak coreset of size Poly(k,1/ε) for the k-means clustering problem, i.e. its size is independent of the cardinality |P| of the point set and the dimension d of the Euclidean space R<sup>d</sup>.

It is shown that the structure of a planar graph on a large enough number of vertices n, and with constant maximum degree d, is determined, up to the modification (insertion or deletion) of at most ε d n edges, by the frequency of k-discs for a certain k = k(ε,d) that is independent of the size of the graph.

This work develops streaming (1 + ε)-approximation algorithms for k-median, k-means, MaxCut, maximum weighted matching (MaxWM), maximum travelling salesperson, maximum spanning tree, and average distance over dynamic geometric data streams.

A negative result is shown, namely that no strongly sublinear-sized coresets exist for logistic regression. To deal with intractable worst-case instances, a complexity measure $\mu(X)$ is introduced, which quantifies the hardness of compressing a data set for logistic regression.

The first linear time (1 + ε)-approximation algorithms for the k-median problem in an arbitrary metric space with bounded doubling dimension, for the Kullback-Leibler divergence (relative entropy), for Mahalanobis distances, and for some special cases of Bregman divergences are obtained.

A new data structure is developed that maintains a set of elements chosen (almost) uniformly at random from P that maintains an ε-approximation of the weight of the Euclidean minimum spanning tree of P with constant VC-dimension.

A hunter strategy for general graphs with an escape length of only $O(n \log(\mathrm{diam}(G)))$ against restricted as well as unrestricted rabbits is found, which is close to optimal since $\Omega(n)$ is a trivial lower bound on the escape length in both models.