Network Applications of Bloom Filters: A Survey

@article{Broder2003NetworkAO,
  title={Network Applications of Bloom Filters: A Survey},
  author={Andrei Z. Broder and Michael Mitzenmacher},
  journal={Internet Mathematics},
  year={2003},
  volume={1},
  pages={485--509}
}
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in database applications since the 1970s, but only in recent years have they become popular in the networking literature. The aim of this paper is to survey the ways in which Bloom filters have been used… 
TLDR
This paper surveys the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
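The data structure described in the abstract can be illustrated with a minimal sketch. It is not taken from the survey: the class name `BloomFilter`, its parameters `num_bits` and `num_hashes`, and the use of SHA-256 with double hashing to derive bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array probed by several hash positions."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item: str):
        # Assumption of this sketch: derive the probe positions from two
        # halves of one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a nonzero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Report membership only if every probed bit is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter(num_bits=1024, num_hashes=4)
for name in ("alice", "bob"):
    bf.add(name)

print("alice" in bf)  # True: an inserted item is never reported absent
print("carol" in bf)  # usually False, but a false positive is possible
```

A query can return a false positive when all of an absent item's bits happen to have been set by other items, but it never returns a false negative for an inserted item. That is the trade-off the abstract refers to: a controlled error probability in exchange for space savings.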
Incremental Bloom Filters
TLDR
This work considers the problem of minimizing memory requirements when the number of elements in the set is not known in advance but the distribution or moment information of the number of elements is known, and shows how to exploit such information to minimize the expected amount of memory required for the filter.
A Way of Eliminating Errors When Using Bloom Filters for Routing in Computer Networks
TLDR
A way of labeling links in a computer network is presented which prevents errors in Bloom filters in some routing scenarios and, therefore, results in a more efficient use of network resources.
Distance-Sensitive Bloom Filters
TLDR
It is demonstrated how appropriate data structures can be designed using locality-sensitive hash functions as a building block, and the performance of a natural scheme under the Hamming metric is analyzed.
Yes-no Bloom filter: A way of representing sets with fewer false positives
TLDR
This paper proposes the yes-no BF, a new way of representing a set, based on the BF, but with significantly lower false positives and no false negatives, and shows that it has better false positive performance than the BF.
Optimization of Compact Set Membership Representation for Distributed Computing
TLDR
This master thesis shows that at least for a specific area in the parameter space Bloom filters are significantly outperformed even by trivial methods, and shows that the unconditional use of Bloom filters is questionable.
Designing a New Bloom Filter-based Index for Distributed Data Management
TLDR
This work proposes a selective insertion method of bloom filter to reduce the workload of BFs by finding an optimal load ratio and shows that this new approach can reduce the false lookup time by 36% compared with the pure bloom filter approach.
Error Detection and Correction Using Bloom Filters
TLDR
This paper considers the design space and the evaluation of a series of extensions to increase the practicality and performance of iBFs, to enable false-negative-free element deletion, and to provide security enhancements.

References

Compressed bloom filters
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications
The Bloomier filter: an efficient data structure for static support lookup tables
TLDR
The Bloomier filter is introduced, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries, and lower bounds are provided to prove the (near) optimality of the constructions.
Spectral bloom filters
TLDR
The Spectral Bloom Filter is introduced, an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose multiplicities are below a threshold given at query time.
A second look at bloom filters
TLDR
It is shown that an analytical approach can yield insights into competing filter design and give expected values for the goodness-of-hash transformations not available with simulation.
Designing a Bloom filter for differential file access
TLDR
The design process for a Bloom filter for an on-line student database is described, and it is shown that a very effective filter can be constructed with a modest expenditure of system resources.
PERF join: an alternative to two-way semijoin and bloomjoin
TLDR
This paper presents “Positionally Encoded Record Filters” (PERFs), describes their use in a distributed query processing technique called PERF join, and demonstrates through analytical studies that PERF join performs significantly better than two-way Bloomjoin and two-way semijoin variants under a wide range of relevant cost parameter values.
Optimal Semijoins for Distributed Database Systems
A Bloom-filter-based semijoin algorithm for distributed database systems is presented. This algorithm reduces communications costs to process a distributed natural join as much as possible with a
Probabilistic location and routing
  • Sean C. Rhea, J. Kubiatowicz
  • Computer Science
    Proceedings. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies
  • 2002
We propose probabilistic location to enhance the performance of existing peer-to-peer location mechanisms in the case where a replica for the queried data item exists close to the query source. We
Forwarding without loops in Icarus
  • A. Whitaker, D. Wetherall
  • Computer Science
    2002 IEEE Open Architectures and Network Programming Proceedings. OPENARCH 2002 (Cat. No.02EX571)
  • 2002
TLDR
Icarus is a framework for detecting forwarding loops in experimental protocols: it adds a small Bloom filter to the packet header to probabilistically detect looping behavior. The scheme is found to be simple, efficient, and widely applicable, like a TFL, yet it ensures significantly earlier loop detection.
A scalable content-addressable network
TLDR
The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.