- Andrew McGregor, Hoa T. Vu
- ICDT
- 2017

We study the classic NP-hard problem of finding the maximum k-set coverage in the data stream model: given a set system of m sets that are subsets of a universe {1, …, n}, find the k sets that cover the largest number of distinct elements. The problem can be approximated up to a factor 1 − 1/e in polynomial time. In the streaming-set model, the sets and…
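To make the problem statement concrete, here is a minimal sketch of the classic offline greedy heuristic, which is the standard way to reach the 1 − 1/e factor in polynomial time. It is not the streaming algorithm of the paper; the function name `greedy_max_coverage` and the toy input are illustrative assumptions.

```python
def greedy_max_coverage(sets, k):
    """Offline greedy for maximum k-coverage (illustrative, not streaming).

    Repeatedly picks the set that covers the most not-yet-covered elements.
    Returns the chosen set indices and the elements they cover.
    """
    sets = [set(s) for s in sets]
    covered, chosen = set(), []
    for _ in range(min(k, len(sets))):
        # Index of the unchosen set with the largest marginal gain.
        best = max(
            (i for i in range(len(sets)) if i not in chosen),
            key=lambda i: len(sets[i] - covered),
        )
        if not sets[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical toy instance over the universe {1, ..., 7}.
example_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_max_coverage(example_sets, k=2)
```

The sketch keeps every set in memory, which is exactly what a sub-linear-space streaming algorithm has to avoid.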

- Andrew McGregor, David Tench, Sofya Vorotnikova, Hoa T. Vu
- MFCS
- 2015

In this paper, we consider the problem of approximating the densest subgraph in the dynamic graph stream model. In this model of computation, the input graph is defined by an arbitrary sequence of edge insertions and deletions and the goal is to analyze properties of the resulting graph given memory that is sub-linear in the size of the stream. We present a…

- Andrew McGregor, Sofya Vorotnikova, Hoa T. Vu
- PODS
- 2016

We present space-efficient data stream algorithms for approximating the number of triangles in a graph up to a factor 1 + ε. While it can be shown that determining whether a graph is triangle-free is not possible in sub-linear space, a large body of work has focused on minimizing the space required in terms of the number of triangles T (or a lower bound…

- Andrew McGregor, Hoa T. Vu
- COCOON
- 2015

Consider a stream of n-tuples that empirically define the joint distribution of n discrete random variables X_1, …, X_n. Previous work of Indyk and McGregor [6] and Braverman et al. [1, 2] addresses the problem of determining whether these variables are n-wise independent by measuring the ℓ_p distance between the joint distribution and the product…

- Michael A. Bender, Samuel McCauley, Andrew McGregor, Shikha Singh, Hoa T. Vu
- ISAAC
- 2015

We revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M, and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs…
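As an illustration of what run generation with a buffer of size M looks like, the following is a minimal sketch of the textbook replacement-selection baseline. It is not the algorithm developed in the paper; the function name `replacement_selection` and the sample input are assumptions made for the example, and M is assumed to be at least 1.

```python
import heapq

_END = object()  # sentinel marking the end of the input stream


def replacement_selection(stream, M):
    """Split `stream` into sorted runs using a buffer of at most M elements."""
    it = iter(stream)
    heap = []
    for x in it:  # fill the buffer with the first M elements
        heap.append(x)
        if len(heap) == M:
            break
    heapq.heapify(heap)

    runs, current, pending = [], [], []
    while heap:
        smallest = heapq.heappop(heap)
        current.append(smallest)  # emit to the current run
        nxt = next(it, _END)
        if nxt is not _END:
            if nxt >= smallest:
                heapq.heappush(heap, nxt)  # can still join the current run
            else:
                pending.append(nxt)  # must wait for the next run
        if not heap:
            # Current run is finished; deferred elements seed the next one.
            runs.append(current)
            current, heap, pending = [], pending, []
            heapq.heapify(heap)
    return runs


runs = replacement_selection([5, 1, 4, 2, 3, 6], M=2)
```

Each pop frees one buffer slot and each read fills at most one, so the heap and the deferred list together never hold more than M elements.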

- Branislav Kveton, S. Muthukrishnan, Hoa T. Vu
- ArXiv
- 2017

Data streams typically have items with a large number of dimensions. We study the fundamental heavy-hitters problem in this setting. Formally, the data stream consists of x_1, …, x_m where each x_i = (x_{i,1}, …, x_{i,d}) is a d-dimensional item, and each x_{i,j} ∈ [n]. A k-dimensional subcube T is a subset of distinct coordinates {T_1, …, T_k} ⊆ [d]. A subcube…
