An improved data stream summary: the count-min sketch and its applications
We introduce a new sublinear space data structure--the count-min sketch--for summarizing data streams. Our sketch allows fundamental queries in data stream summarization such as point, range, and…
  • 1,527
  • 245
Key differences between Web 1.0 and Web 2.0
Web 2.0 is a buzzword introduced in 2003-04 which is commonly used to encompass various novel phenomena on the World Wide Web. Although largely a marketing term, some of the key attributes associated…
  • 565
  • 137
Differentially Private Spatial Decompositions
Differential privacy has recently emerged as the de facto standard for private data release. This makes it possible to provide strong theoretical guarantees on the privacy and utility of released…
  • 288
  • 45
Advances and Open Problems in Federated Learning
Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. …
  • 264
  • 30
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks
When dealing with massive quantities of data, top-k queries are a powerful technique for returning only the k most relevant tuples for inspection, based on a scoring function. The problem of…
  • 195
  • 30
What's hot and what's not: tracking most frequent items dynamically
Most database management systems maintain statistics on the underlying relation. One of the important statistics is that of the “hot items” in the relation: those that appear many times (most…
  • 375
  • 23
The string edit distance matching problem with moves
The edit distance between two strings S and R is defined to be the minimum number of character inserts, deletes and changes needed to convert R to S. Given a text string t of length n, and a pattern…
  • 162
  • 21
Sketching Streams Through the Net: Distributed Approximate Query Tracking
Emerging large-scale monitoring applications require continuous tracking of complex data-analysis queries over collections of physically-distributed streams. Effective solutions have to be…
  • 211
  • 20
Sketching probabilistic data streams
The management of uncertain, probabilistic data has recently emerged as a useful paradigm for dealing with the inherent unreliabilities of several real-world application domains, including data…
  • 179
  • 19
What's new: finding significant differences in network data streams
Monitoring and analyzing network traffic usage patterns is vital for managing IP Networks. An important problem is to provide network managers with information about changes in traffic, informing…
  • 95
  • 17