# Near-Optimal Compression of Probabilistic Counting Sketches for Networking Applications

@inproceedings{Scheuermann2007NearOptimalCO, title={Near-Optimal Compression of Probabilistic Counting Sketches for Networking Applications}, author={Bj{\"o}rn Scheuermann and Martin Mauve}, booktitle={DIALM-POMC}, year={2007} }

Sketches—data structures for probabilistic, duplicate insensitive counting—are central building blocks of a number of recently proposed network protocols, for example in the context of wireless sensor networks. They can be used to perform robust, distributed data aggregation in a broad range of settings and applications. However, the structure of these sketches is very redundant, making effective compression vital if they are to be transmitted over a network. Here, we propose lossless…


## 20 Citations

A survey of sketches in traffic measurement: Design, Optimization, Application and Implementation

- Computer Science
- 2020

This work introduces the preparation of flows for measurement, then details the most recent investigations of design, aggregation, decoding, application and implementation of sketches for network measurement, covering more than 90 sketch designs and optimization strategies.

Sketch for traffic measurement: design, optimization, application and implementation

- Computer Science
- ArXiv
- 2020

This work introduces the preparation of flows for measurement, then details the most recent investigations of design, aggregation, decoding, application and implementation of sketches for network measurement, and conducts an in-depth study of the existing literature.

Non-Mergeable Sketching for Cardinality Estimation

- Computer Science, Mathematics
- ICALP
- 2021

It is proved that the Martingale transform is optimal in the non-mergeable world, and that the Fishmonger sketch in particular is optimal among linearizable sketches, with an MVP of $H_0/2 \approx 1.63$.

How to Make Private Distributed Cardinality Estimation Practical, and Get Differential Privacy for Free

- Computer Science
- IACR Cryptol. ePrint Arch.
- 2020

It is revealed that if the cardinality to be estimated is large enough, the protocol can achieve (ε,δ)-differential privacy automatically, without requiring any additional manipulation of the output, which signifies a new approach for achieving differential privacy that departs from the mainstream approach.

High-Speed Per-Flow Traffic Measurement with Probabilistic Multiplicity Counting

- Computer Science
- 2010

Probabilistic Multiplicity Counting (PMC) is presented, a novel data structure that is capable of accounting traffic per flow probabilistically and provides very accurate traffic statistics.

Approximating Private Set Union/Intersection Cardinality With Logarithmic Complexity

- Computer Science, Mathematics
- IEEE Transactions on Information Forensics and Security
- 2017

Efficient approximate protocols are proposed whose accuracy can be tuned according to application requirements; they are derived from the PSU-CA protocol with virtually no cost and can hide its output.

Cardinality Estimation for Elephant Flows

- 2017

For many practical applications, it is a fundamental problem to estimate the flow cardinalities over big network data consisting of numerous flows (especially a large quantity of mouse flows mixed…

Cardinality Estimation for Elephant Flows: A Compact Solution Based on Virtual Register Sharing

- Computer Science
- IEEE/ACM Transactions on Networking
- 2017

A unified framework of virtual estimators is proposed that allows the idea of sharing to apply to an array of cardinality estimation solutions, e.g., HyperLogLog and PCSA, achieving far better memory efficiency than the best existing work.

A probabilistic method for cooperative hierarchical aggregation of data in VANETs

- Computer Science
- Ad Hoc Networks
- 2010

This work proposes soft-state sketches, an extension of Flajolet-Martin sketches, as a probabilistic approximation for the hierarchical aggregation of observations in dissemination-based, distributed traffic information systems. The approach is duplicate insensitive and results in a very flexible aggregate construction and a high quality of the aggregates.

Distributed super point cardinality estimation under sliding time window for high speed network

- Computer Science
- ArXiv
- 2018

The algorithm proposed in this paper detects super points and estimates their cardinalities under a sliding time window in real time, and a novel reversible hash function scheme is devised to restore super points from a pool of AT.

## References


Approximate aggregation techniques for sensor databases

- Computer Science
- Proceedings. 20th International Conference on Data Engineering
- 2004

This work generalizes well-known duplicate-insensitive sketches for approximating COUNT to handle SUM, presents and analyzes methods for using sketches to produce accurate results with low communication and computation overhead, and provides an extensive experimental validation of the methods.

Counting by Coin Tossings

- Computer Science
- ASIAN
- 2004

This text is an informal review of several randomized algorithms that have appeared over the past two decades and have proved instrumental in extracting efficiently quantitative characteristics of…

Synopsis diffusion for robust aggregation in sensor networks

- Computer Science
- SenSys '04
- 2004

This paper presents a general framework for achieving significantly more accurate and reliable answers by combining energy-efficient multi-path routing schemes with techniques that avoid double-counting, and demonstrates the significant robustness, accuracy, and energy-efficiency improvements of synopsis diffusion over previous approaches.

HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

- Mathematics
- 2007

This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of distinct elements (the cardinality) of very large data…

Bitmap Algorithms for Counting Active Flows on High-Speed Links

- Computer Science
- IEEE/ACM Transactions on Networking
- 2006

A family of bitmap algorithms that address the problem of counting the number of distinct header patterns (flows) seen on a high-speed link and can be used to detect DoS attacks and port scans and to solve measurement problems.

Probabilistic Counting Algorithms for Data Base Applications

- Computer Science
- J. Comput. Syst. Sci.
- 1985

A class of probabilistic counting algorithms with which one can estimate the number of distinct elements in a large collection of data in a single pass using only a small additional storage and only a few operations per element scanned is introduced.

Efficient and decentralized computation of approximate global state

- Computer Science
- CCRV
- 2006

The need for efficient computation of approximate global state lies at the heart of a wide range of problems in distributed systems and solving these problems can radically improve the design of robust, efficient and self-managed distributed systems.

Probabilistic aggregation for data dissemination in VANETs

- Computer Science
- VANET '07
- 2007

An algorithm for the hierarchical aggregation of observations in dissemination-based, distributed traffic information systems that overcomes two central problems of existing aggregation schemes for VANET applications and contains a modified Flajolet-Martin sketch as a probabilistic approximation.

Loglog counting of large cardinalities

- Mathematics
- 2003

Using an auxiliary memory smaller than the size of this abstract, the LOGLOG algorithm makes it possible to estimate in a single pass and within a few percents the number of different words in the…

Order statistics and estimating cardinalities of massive data sets

- Computer Science, Mathematics
- Discret. Appl. Math.
- 2009

A new class of algorithms to estimate the cardinality of very large multisets using constant memory and doing only one pass on the data is introduced here. It is based on order statistics rather than…