Marcel R. Ackermann

We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P of size n, our goal is to find a set C of size k such that the sum of errors D(P, C) = ∑_{p ∈ P} min_{c ∈ C} D(p, c) is…
We study the generalized k-median problem with respect to a Bregman divergence D_φ. Given a finite set P ⊆ ℝ^d of size n, our goal is to find a set C of size k such that the sum of errors cost(P, C) = ∑_{p ∈ P} min_{c ∈ C} D_φ(p, c) is minimized. The Bregman k-median problem plays an important role in many applications, e.g. information theory, statistics, text…
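As a rough illustration of the objective defined above, the following Python sketch evaluates cost(P, C) for a pluggable divergence. The function names are hypothetical and not taken from the paper. The two divergences shown are standard examples of Bregman divergences: the squared Euclidean distance (φ(x) = ‖x‖²) and the generalized Kullback-Leibler divergence (φ(x) = ∑ x_i log x_i, on strictly positive coordinates).

```python
import math

def squared_euclidean(p, c):
    # Bregman divergence generated by phi(x) = ||x||^2.
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))

def generalized_kl(p, c):
    # Bregman divergence generated by phi(x) = sum_i x_i log x_i.
    # Assumes all coordinates of p and c are strictly positive.
    return sum(pi * math.log(pi / ci) - pi + ci for pi, ci in zip(p, c))

def cost(points, centers, divergence):
    # Sum over all points of the divergence to the closest center.
    return sum(min(divergence(p, c) for c in centers) for p in points)

if __name__ == "__main__":
    P = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    C = [(0.0, 0.0), (10.0, 0.0)]
    print(cost(P, C, squared_euclidean))
```

Note that a Bregman divergence is in general asymmetric, so the argument order D_φ(p, c) (point first, center second) matters.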
We develop a new k-means clustering algorithm for data streams of points from a Euclidean space. We call this algorithm StreamKM++. Our algorithm computes a small weighted sample of the data stream and solves the problem on the sample using the k-means++ algorithm of Arthur and Vassilvitskii (SODA '07). To compute…
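The k-means++ step mentioned above is built on D² sampling: each new center is drawn with probability proportional to its (weighted) squared distance to the nearest center chosen so far. The sketch below shows only this seeding step on a weighted point set. It is not the StreamKM++ sample construction, which the truncated abstract does not describe, and the function names are hypothetical.

```python
import random

def sq_dist(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seed(points, weights, k, rng=random):
    # First center: drawn with probability proportional to its weight.
    centers = [rng.choices(points, weights=weights, k=1)[0]]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        scores = [w * d for w, d in zip(weights, d2)]
        if sum(scores) == 0:
            break  # every point already coincides with a chosen center
        c = rng.choices(points, weights=scores, k=1)[0]
        centers.append(c)
        d2 = [min(d, sq_dist(p, c)) for p, d in zip(points, d2)]
    return centers

if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    print(kmeanspp_seed(sample, [1.0, 1.0, 2.0, 2.0], 2, random.Random(0)))
```

In the full k-means++ algorithm, the seeds would then typically be refined with Lloyd-style iterations on the same weighted points.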
The diameter k-clustering problem is the problem of partitioning a finite subset of ℝ^d into k subsets called clusters such that the maximum diameter of the clusters is minimized. One early clustering algorithm that computes a hierarchy of approximate solutions to this problem (for all values of k) is the agglomerative clustering algorithm with the complete…
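The abstract above is cut off after "complete"; assuming it refers to the standard complete-linkage rule, a naive sketch of agglomerative clustering looks as follows. It starts from singleton clusters and repeatedly merges the pair whose farthest cross-cluster distance is smallest, stopping at k clusters. This is a plain cubic-time illustration with hypothetical names, not the analysis or implementation from the paper.

```python
import math

def linkage_distance(a, b):
    # Complete linkage: largest distance between a point of a and a point of b.
    return max(math.dist(p, q) for p in a for q in b)

def diameter(cluster):
    # Largest pairwise distance inside one cluster (0.0 for a singleton).
    return max((math.dist(p, q) for p in cluster for q in cluster), default=0.0)

def complete_linkage(points, k):
    clusters = [[p] for p in points]  # start with singletons
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters

if __name__ == "__main__":
    result = complete_linkage([(0.0,), (1.0,), (10.0,), (11.0,)], 2)
    print(result, max(diameter(c) for c in result))
```

Recording the clustering after every merge, rather than stopping at k, would yield the full hierarchy for all values of k.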
We prove the computational hardness of three k-clustering problems using an (almost) arbitrary Bregman divergence as dissimilarity measure: (a) The Bregman k-center problem, where the objective is to find a set of centers that minimizes the maximum dissimilarity of any input point towards its closest center, and (b) the Bregman k-diameter problem, where the…
Digital storage demand is growing with the increasing use of digital artifacts from media files to business documents. Regulatory frameworks ask for unaltered, durable storage of business communications. In this paper we consider the problem of getting reliable evidence of the integrity and existence of some data from a storage service even if the data is…