In this paper, we consider the problem of approximating the densest subgraph in the dynamic graph stream model. In this model of computation, the input graph is defined by an arbitrary sequence of edge insertions and deletions, and the goal is to analyze properties of the resulting graph given memory that is sub-linear in the size of the stream. We present a …
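To make the model concrete, the following minimal sketch replays a stream of edge insertions and deletions and evaluates the density of a vertex subset. It is not the paper's algorithm: it stores every edge, so it uses linear rather than sub-linear memory. The names `apply_stream` and `density`, the `("+", u, v)` / `("-", u, v)` update encoding, and the use of the common definition of density (edges inside the subset divided by the number of vertices in it) are assumptions made for illustration.

```python
from collections import Counter


def apply_stream(updates):
    """Replay edge insertions ("+") and deletions ("-") and return the final edge set.

    Assumed encoding: each update is a tuple (op, u, v) on an undirected graph.
    This keeps every edge in memory, so it only illustrates the model.
    """
    multiplicity = Counter()
    for op, u, v in updates:
        key = (min(u, v), max(u, v))  # normalize undirected edges
        multiplicity[key] += 1 if op == "+" else -1
    return {edge for edge, count in multiplicity.items() if count > 0}


def density(edges, subset):
    """Density of the subgraph induced by `subset`: induced edges / vertices."""
    vertices = set(subset)
    if not vertices:
        return 0.0
    inside = sum(1 for u, v in edges if u in vertices and v in vertices)
    return inside / len(vertices)


updates = [
    ("+", 1, 2), ("+", 2, 3), ("+", 1, 3),  # a triangle on {1, 2, 3}
    ("+", 3, 4), ("-", 3, 4),               # inserted, then deleted
    ("+", 4, 5),
]
edges = apply_stream(updates)
triangle_density = density(edges, {1, 2, 3})
whole_graph_density = density(edges, {1, 2, 3, 4, 5})
```

In this example the triangle is denser than the graph as a whole, which is the kind of subset a densest-subgraph algorithm aims to find.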
We present space-efficient data stream algorithms for approximating the number of triangles in a graph up to a factor 1+ε. While it can be shown that determining whether a graph is triangle-free is not possible in sub-linear space, a large body of work has focused on minimizing the space required in terms of the number of triangles T (or a lower bound …
Consider a stream of n-tuples that empirically define the joint distribution of n discrete random variables X1, ..., Xn. Previous work of Indyk and McGregor [6] and Braverman et al. [1, 2] addresses the problem of determining whether these variables are n-wise independent by measuring the ℓ_p distance between the joint distribution and the product …
We revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M, and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs …
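As a point of reference for the problem statement, here is a minimal sketch of replacement selection, the classic textbook heuristic for run generation. It is not claimed to be the algorithm developed in the paper. The function name `replacement_selection` and the in-memory list of runs are assumptions for illustration; a real external-memory sort would write each run to disk.

```python
import heapq

_END = object()  # sentinel marking the end of the input stream


def replacement_selection(stream, M):
    """Split `stream` into sorted runs using a buffer of at most M elements."""
    if M < 1:
        raise ValueError("buffer size M must be at least 1")
    it = iter(stream)
    heap = []
    for x in it:  # fill the buffer with the first M elements
        heap.append(x)
        if len(heap) == M:
            break
    heapq.heapify(heap)

    runs, current, frozen = [], [], []
    while heap:
        smallest = heapq.heappop(heap)
        current.append(smallest)
        nxt = next(it, _END)
        if nxt is not _END:
            if nxt >= smallest:
                heapq.heappush(heap, nxt)  # can still extend the current run
            else:
                frozen.append(nxt)         # must wait for the next run
        if not heap:                       # current run is finished
            runs.append(current)
            current = []
            heap, frozen = frozen, []
            heapq.heapify(heap)
    return runs


data = [5, 1, 4, 2, 3, 6, 0]
runs = replacement_selection(data, M=3)
```

The heap and the frozen list together never hold more than M elements, so the sketch respects the buffer constraint while often producing runs longer than M.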
Data streams typically have items with a large number of dimensions. We study the fundamental heavy-hitters problem in this setting. Formally, the data stream consists of x1, ..., xm, where each xi = (xi,1, ..., xi,d) is a d-dimensional item and each xi,j ∈ [n]. A k-dimensional subcube T is a subset of distinct coordinates {T1, ..., Tk} ⊆ [d]. A subcube …