We study clustering under the data stream model of computation: given a sequence of points, the objective is to maintain a consistently good clustering of the sequence observed so far.

We study clustering problems in the streaming model, where the goal is to cluster a set of points by making one pass (or a few passes) over the data using a small amount of storage space. The result is a randomized algorithm that produces a constant-factor approximation in one pass using O(k poly log n) storage space.

We present a novel technique for solving two important and related problems in the sliding window model: maintaining variance and maintaining a k-median clustering.

We give a sampling-based algorithm for the k-Median problem, with fully polynomial running time that is independent of n, the size of the data set.

We consider the problem of estimating the length of a shortest path in a DAG whose edge lengths are known only approximately but can be determined exactly at a cost.

Query optimization that involves expensive predicates has received considerable attention in the database community. Typically, the output to a database query is a set of tuples that satisfy certain…