- Chao Li, Michael Hay, Vibhor Rastogi, Gerome Miklau, Andrew McGregor
- PODS
- 2010

Differential privacy is a robust privacy standard that has been successfully applied to a range of data analysis tasks. But despite much recent work, optimal strategies for answering a collection of related queries are not known.
We propose the matrix mechanism, a new algorithm for answering a workload of predicate counting queries. Given a workload, the… (More)

- Andrew McGregor
- APPROX-RANDOM
- 2005

We present algorithms for finding large graph matchings in the streaming model. In this model, applicable when dealing with massive graphs, edges are streamed-in in some arbitrary order rather than residing in randomly accessible memory. For ε > 0, we achieve a 1/(1+ε) approximation for maximum cardinality matching and a 1/(2+ε) approximation to maximum… (More)
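For orientation, the streaming setting described above can be illustrated with the classic one-pass greedy maximal matching, which keeps only the set of matched vertices in memory and is a well-known 1/2 approximation for maximum cardinality matching. This is a baseline sketch, not the algorithm of the paper; the function name `greedy_stream_matching` is a hypothetical choice for illustration.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_stream_matching(edges: Iterable[Edge]) -> List[Edge]:
    """One pass over an edge stream; keep an edge iff both endpoints are free.

    The result is a maximal matching, so its size is at least half the size
    of a maximum cardinality matching. Memory is proportional to the number
    of matched vertices, not to the number of edges in the stream.
    """
    matched: Set[Hashable] = set()   # vertices already used by the matching
    matching: List[Edge] = []
    for u, v in edges:               # edges arrive in arbitrary order
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


if __name__ == "__main__":
    # Path 1-2-3-4: the arrival order changes what the greedy pass keeps.
    print(greedy_stream_matching([(1, 2), (2, 3), (3, 4)]))
    print(greedy_stream_matching([(2, 3), (1, 2), (3, 4)]))
```

The second call shows the worst case for this baseline on a path: taking the middle edge first blocks both outer edges, so the greedy pass keeps one edge where the optimum has two.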

We investigate the importance of space when solving problems based on graph distance in the streaming model. In this model, the input graph is presented as a stream of edges in an arbitrary order. The main computational restriction of the model is that we have limited space and therefore cannot store all the streamed data; we are forced to make… (More)

- Amit Chakrabarti, Graham Cormode, Andrew McGregor
- SODA
- 2007

We describe a simple algorithm for approximating the empirical entropy of a stream of *m* values in a single pass, using *O*(ε⁻² log(δ⁻¹) log *m*) words of space. Our algorithm is based upon a novel extension of a method introduced by Alon, Matias, and Szegedy [1]. We show a space lower bound of… (More)
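The Alon, Matias, and Szegedy style of estimator that the abstract refers to can be sketched as follows. This is a minimal illustration of the basic unbiased estimator only, not the paper's extension or its space-efficient implementation: it assumes the stream is held in a list for clarity (a true single-pass version would pick the position by reservoir sampling), and all function names are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def exact_entropy(stream: Sequence) -> float:
    """Empirical entropy H = sum_i (m_i / m) * log2(m / m_i)."""
    m = len(stream)
    return sum((c / m) * math.log2(m / c) for c in Counter(stream).values())


def _g(r: int, m: int) -> float:
    # g(r) = r * log2(m / r), with g(0) = 0 by convention.
    return 0.0 if r == 0 else r * math.log2(m / r)


def basic_estimate(stream: Sequence, j: int) -> float:
    """Estimator for a sampled position j.

    r counts occurrences of stream[j] from position j to the end of the
    stream; the estimate is g(r) - g(r - 1). Averaged over a uniformly
    random j, the differences telescope and the expectation equals H.
    """
    m = len(stream)
    r = sum(1 for x in stream[j:] if x == stream[j])
    return _g(r, m) - _g(r - 1, m)


def estimate_entropy(stream: Sequence, samples: int, rng: random.Random) -> float:
    """Average independent basic estimates to reduce variance."""
    m = len(stream)
    draws: List[float] = [
        basic_estimate(stream, rng.randrange(m)) for _ in range(samples)
    ]
    return sum(draws) / samples


if __name__ == "__main__":
    data = list("aabacbbdaaab")
    print(exact_entropy(data))
    print(estimate_entropy(data, samples=2000, rng=random.Random(0)))
```

Averaging `basic_estimate` over every position reproduces the exact entropy, which is a convenient sanity check for the unbiasedness claim.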

- Daniel W. Barowy, Charlie Curtsinger, Emery D. Berger, Andrew McGregor
- Commun. ACM
- 2012

Humans can perform many tasks with ease that remain difficult or impossible for computers. Crowdsourcing platforms like Amazon's Mechanical Turk make it possible to harness human-based computational power at an unprecedented scale. However, their utility as a general-purpose computational platform remains limited. The lack of complete automation makes it… (More)

- Graham Cormode, Andrew McGregor
- PODS
- 2008

There is an increasing quantity of data with uncertainty arising from applications such as sensor network measurements, record linkage, and as output of mining algorithms. This uncertainty is typically formalized as probability density functions over tuple values. Beyond storing and processing such data in a DBMS, it is necessary to perform other data… (More)

- Andrew McGregor, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, Salil P. Vadhan
- 2010 IEEE 51st Annual Symposium on Foundations of…
- 2010

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp… (More)

- Kook Jin Ahn, Sudipto Guha, Andrew McGregor
- PODS
- 2012

When processing massive data sets, a core task is to construct *synopses* of the data. To be useful, a synopsis data structure should be easy to construct while also yielding good approximations of the relevant properties of the data set. A particularly useful class of synopses are *sketches*, i.e., those based on linear projections of the data.… (More)
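To make "linear projections of the data" concrete, here is a small sketch under stated assumptions: the data is a frequency vector, the projection is a random ±1 matrix, and the names `make_projection`, `sketch`, and `update` are hypothetical. It is meant only to show why linearity is useful (sketches can be added together and updated incrementally, including under deletions), not to reproduce any construction from the paper.

```python
import random
from typing import List


def make_projection(rows: int, dim: int, seed: int) -> List[List[int]]:
    """Random +/-1 matrix; the same seed must be shared by all parties."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(rows)]


def sketch(S: List[List[int]], x: List[int]) -> List[int]:
    """Linear sketch S @ x of a frequency vector x."""
    return [sum(s * xi for s, xi in zip(row, x)) for row in S]


def update(S: List[List[int]], sk: List[int], index: int, delta: int) -> None:
    """Apply a stream update x[index] += delta directly to the sketch."""
    for r, row in enumerate(S):
        sk[r] += row[index] * delta


if __name__ == "__main__":
    S = make_projection(rows=4, dim=8, seed=1)
    x = [3, 0, 1, 0, 0, 2, 0, 0]
    y = [0, 5, 0, 0, 1, 0, 0, 4]
    combined = [a + b for a, b in zip(x, y)]
    # Linearity: the sketch of the sum equals the sum of the sketches.
    print(sketch(S, combined) == [a + b for a, b in zip(sketch(S, x), sketch(S, y))])
```

Because the map is linear, two sites can sketch their own data independently and a coordinator can combine the results without seeing the raw vectors.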

- Andrew McGregor
- SIGMOD Record
- 2014

Over the last decade, there has been considerable interest in designing algorithms for processing massive graphs in the data stream model. The original motivation was two-fold: a) in many applications, the dynamic graphs that arise are too large to be stored in the main memory of a single machine and b) considering graph problems yields new insights into… (More)

Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invested in designing efficient algorithms for finding such "high discrepancy" regions, with methods ranging from fast heuristics for special cases, to general grid-based methods, and to… (More)