Detecting Novel Associations in Large Data Sets
A statistical method reveals relationships among variables in complex data sets. Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, ...
  • 1,405
  • 194
Network Applications of Bloom Filters: A Survey
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often ...
  • 2,007
  • 161
The Power of Two Choices in Randomized Load Balancing
  • M. Mitzenmacher
  • Computer Science
  • IEEE Trans. Parallel Distrib. Syst.
  • 1 October 2001
We consider the following natural model: customers arrive as a Poisson stream of rate $\lambda n$, $\lambda < 1$, at a collection of $n$ servers. Each customer chooses some constant $d$ servers ...
  • 1,081
  • 127
A Brief History of Generative Models for Power Law and Lognormal Distributions
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying to learn enough about these ...
  • 1,503
  • 76
Efficient erasure correcting codes
We introduce a simple erasure recovery algorithm for codes derived from cascades of sparse bipartite graphs and analyze the algorithm by analyzing a corresponding discrete-time random process. As a ...
  • 1,196
  • 71
Probability and Computing: Randomized Algorithms and Probabilistic Analysis
Preface 1. Events and probability 2. Discrete random variables and expectation 3. Moments and deviations 4. Chernoff bounds 5. Balls, bins and random graphs 6. The probabilistic method 7. Markov ...
  • 1,338
  • 67
Min-Wise Independent Permutations
We define and study the notion of min-wise independent families of permutations. We say that $F \subseteq S_n$ (the symmetric group) is min-wise independent if for any set $X \subseteq [n]$ and any $x \in X$, when $\pi$ is chosen at ...
  • 806
  • 67
Cuckoo Filter: Practically Better Than Bloom
In many networking systems, Bloom filters are used for high-speed set membership tests. They permit a small fraction of false positive answers with very good space efficiency. However, they do not ...
  • 281
  • 60
Privacy Preserving Keyword Searches on Remote Encrypted Data
We consider the following problem: a user $\mathcal{U}$ wants to store his files in an encrypted form on a remote file server $\mathcal{S}$. Later the user $\mathcal{U}$ wants to efficiently retrieve ...
  • 1,001
  • 58
Practical loss-resilient codes
We present randomized constructions of linear-time encodable and decodable codes that can transmit over lossy channels at rates extremely close to capacity. The encoding and decoding algorithms for ...
  • 784
  • 52