Counting Distinct Elements in a Data Stream
We present three algorithms to count the number of distinct elements in a data stream to within a factor of 1 ± ε. Our algorithms improve upon known algorithms for this problem, and offer a spectrum…
  • 464
  • 58
Index Coding With Side Information
TLDR
We show that for natural classes of side information graphs, including directed acyclic graphs, perfect graphs, odd holes, and odd anti-holes, minrank is the optimal length of arbitrary INDEX codes.
  • 271
  • 51
Index Coding With Side Information
TLDR
We show that for natural classes of side information graphs, including directed acyclic graphs, perfect graphs, odd holes, and odd anti-holes, minrank is the optimal length of arbitrary INDEX codes.
  • 299
  • 45
An information statistics approach to data stream and communication complexity
TLDR
We present a new method for proving strong lower bounds in communication complexity.
  • 409
  • 24
An information statistics approach to data stream and communication complexity
TLDR
We present a new method for proving strong lower bounds in communication complexity, based on the notion of the conditional information complexity of a function, which is the minimum amount of information about the inputs that has to be revealed by a communication protocol for the function.
  • 202
  • 22
Exponential separation of quantum and classical one-way communication complexity
TLDR
We give the first exponential separation between quantum and bounded-error randomized one-way communication complexity.
  • 118
  • 16
OLAP over uncertain and imprecise data
TLDR
We extend the OLAP data model to represent data ambiguity, specifically imprecision and uncertainty, and introduce an allocation-based approach to the semantics of aggregation queries over such data.
  • 202
  • 15
Estimating the sortedness of a data stream
TLDR
The distance to monotonicity of a sequence is the minimum number of edit operations required to transform the sequence into increasing order; this measure is complementary to the length of the longest increasing subsequence (LIS).
  • 67
  • 13
Approximating edit distance efficiently
TLDR
We develop algorithms that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than ℓ, decide which of the two holds.
  • 130
  • 11
Efficient aggregation algorithms for probabilistic data
TLDR
We study the problem of computing aggregation operators on probabilistic data in an I/O efficient manner.
  • 101
  • 10