# Spectral bloom filters

@inproceedings{Cohen2003SpectralBF, title={Spectral bloom filters}, author={Saar Cohen and Y. Matias}, booktitle={SIGMOD '03}, year={2003} }

A Bloom Filter is a space-efficient randomized data structure allowing membership queries over sets with certain allowable errors. It is widely used in many applications which take advantage of its ability to compactly represent a set, and filter out effectively any element that does not belong to the set, with small error probability. This paper introduces the Spectral Bloom Filter (SBF), an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose…
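As a rough illustration of extending a Bloom filter from sets to multisets, the Python sketch below replaces each bit with a counter and estimates an item's multiplicity as the minimum of its counters. The class name `CountingMultisetFilter`, the salted SHA-256 hashing, and the parameter values are assumptions made for this example and are not taken from the paper.

```python
import hashlib


class CountingMultisetFilter:
    """Bloom-style filter with counters instead of bits (illustrative sketch)."""

    def __init__(self, num_counters, num_hashes):
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Derive k counter positions by salting SHA-256 with the hash index
        # (an assumed hashing scheme, chosen only for reproducibility).
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item, count=1):
        # set() keeps two hashes that land on the same counter
        # from incrementing it twice for a single insertion.
        for pos in set(self._positions(item)):
            self.counters[pos] += count

    def estimate(self, item):
        # Every counter of an item is at least its true multiplicity,
        # so the minimum is the tightest available upper bound.
        return min(self.counters[pos] for pos in self._positions(item))

    def exceeds(self, item, threshold):
        # Filter on multiplicity rather than on plain membership.
        return self.estimate(item) >= threshold


sbf = CountingMultisetFilter(num_counters=1024, num_hashes=4)
for word in ["apple", "apple", "apple", "pear"]:
    sbf.add(word)
print(sbf.estimate("apple"), sbf.exceeds("pear", 2))
```

Because collisions can only increase a counter, the estimate may overshoot the true multiplicity but never falls below it, which mirrors the one-sided error of an ordinary Bloom filter.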

## 409 Citations

A Scalable Bloom Filter for Membership Queries

- Computer Science
- IEEE GLOBECOM 2007 - IEEE Global Telecommunications Conference
- 2007

A new design of a scalable Bloom filter (SBF) for an expanding data set that keeps a low false positive rate by adding Bloom filter vectors with double length when necessary and outperforms other current scalable Bloom filters significantly.
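The idea of adding vectors of double length can be sketched as follows. This is a minimal illustration, not the cited paper's design: the growth trigger (a fixed number of bits per inserted item), the salted SHA-256 hashing, and the name `DoublingBloomFilter` are all assumptions.

```python
import hashlib


class DoublingBloomFilter:
    """Chain of bit vectors; each new vector is twice as long as the last."""

    def __init__(self, initial_bits=64, num_hashes=3, bits_per_item=8):
        self.k = num_hashes
        self.bits_per_item = bits_per_item        # assumed load threshold
        self.vectors = [bytearray(initial_bits)]  # one byte per bit, for clarity
        self.inserted = [0]                       # items stored per vector

    def _positions(self, item, size):
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()[:8], "big"
            ) % size
            for i in range(self.k)
        ]

    def add(self, item):
        current = self.vectors[-1]
        # When the newest vector is "full" under the assumed threshold,
        # append a fresh vector of double length and insert there instead.
        if self.inserted[-1] >= len(current) // self.bits_per_item:
            current = bytearray(2 * len(current))
            self.vectors.append(current)
            self.inserted.append(0)
        for pos in self._positions(item, len(current)):
            current[pos] = 1
        self.inserted[-1] += 1

    def __contains__(self, item):
        # An item may live in any vector, so a query must probe all of them.
        return any(
            all(vec[pos] for pos in self._positions(item, len(vec)))
            for vec in self.vectors
        )


f = DoublingBloomFilter()
for n in range(100):
    f.add(f"key-{n}")
print([len(v) for v in f.vectors], "key-42" in f)
```

Doubling keeps the number of vectors logarithmic in the number of inserted items, so a query probes few vectors even as the set grows.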

Adaptive Bloom filter

- Computer Science
- 2006

The traditional Bloom filter is generalized to an Adaptive Bloom Filter, which incorporates information on the query frequencies and the membership likelihood of the elements into its optimal design, and it is shown that the adapted Bloom filter always outperforms the traditional Bloom filter.

Weighted Bloom filter

- Computer Science
- 2006 IEEE International Symposium on Information Theory
- 2006

The traditional Bloom filter is generalized to a weighted Bloom filter, which incorporates information on the query frequencies and the membership likelihood of the elements into its optimal design, and it is shown that the adapted Bloom filter always outperforms the traditional Bloom filter.

Cardinality Computing: A New Step Towards Fully Representing Multi-sets by Bloom Filters

- Computer Science
- WISE
- 2006

Two novel algorithms for computing cardinalities of multi-sets represented by Bloom Filters are introduced, which extend the functionality of the Bloom Filter and thus make it usable in a variety of new applications.

Theory and Network Applications of Dynamic Bloom Filters

- Computer Science
- Proceedings IEEE INFOCOM 2006. 25th IEEE International Conference on Computer Communications
- 2006

This paper proves that dynamic Bloom filters (DBF) can keep the false positive probability low by adjusting the number of standard Bloom filters used according to the actual size of the current dynamic set, and presents multidimension dynamic Bloom filters (MDDBF) to support concise representation and approximate membership queries of dynamic sets in multiple attribute dimensions.

Optimizing data popularity conscious bloom filters

- Computer Science
- PODC '08
- 2008

This paper studies the problem of minimizing the false-positive probability of a Bloom filter by adapting the number of hashes used for each data object to its popularity in sets and membership queries and proposes two polynomial-time solutions with bounded approximation ratios.

i-DBF: an Improved Bloom Filter Representation Method on Dynamic Set

- Computer Science
- 2006 Fifth International Conference on Grid and Cooperative Computing Workshops
- 2006

It has been proved that DBF not only possesses the advantages of the standard Bloom filter but also behaves better when dealing with dynamic sets, and the improved dynamic Bloom filter, i-DBF, performs better in both storage space and false positive probability.

The Dynamic Bloom Filters

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2010

This work proposes dynamic Bloom filters to represent dynamic sets as well as static sets, and designs the necessary item insertion, membership query, item deletion, and filter union algorithms.

Suitability of a new Bloom filter for numerical vectors with high dimensions

- Computer Science
- PLoS ONE
- 2018

A new uniform Prime-HD-BKDERhash family and a new Bloom filter (P-HDBF) are proposed to test membership in big data sets of high-dimensional numerical vectors, providing an efficient alternative for membership search in terms of space and time overheads.

Multiple Set Matching and Pre-Filtering with Bloom Multifilters

- Computer Science
- ArXiv
- 2019

This article proposes two efficient Bloom Multifilters called Bloom Matrix and Bloom Vector which are space efficient and answer queries with a set of identifiers for multiple set matching problems and shows that the space efficiency can be optimized further according to the distribution of labels among multiple sets: Uniform and Zipf.

## References

Compressed bloom filters

- Computer Science
- PODC '01
- 2001

A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications…

Network Applications of Bloom Filters: A Survey

- Computer Science
- Internet Math.
- 2003

The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.

Maintaining Stream Statistics over Sliding Windows

- Computer Science, Mathematics
- SIAM J. Comput.
- 2002

The problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far, is considered, and it is shown that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, the number of 1's can be estimated to within a factor of $1 + \epsilon$.

Succinct Dynamic Data Structures

- Computer Science
- WADS
- 2001

Succinct data structures are developed to represent a sequence of values that supports partial sum and select queries and updates, and a dynamic array that supports insertion, deletion and access of an element at any given index.

Space/time trade-offs in hash coding with allowable errors

- Computer Science
- CACM
- 1970

Analysis of the paradigm problem demonstrates that allowing a small number of test messages to be falsely identified as members of the given set will permit a much smaller hash area to be used without increasing reject time.
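The space/error trade-off described here is usually quantified with the standard textbook approximation below, which may differ in detail from the analysis in the 1970 paper. It assumes $m$ bits, $n$ inserted elements, and $k$ independent, uniformly distributed hash functions; these symbols are introduced for this illustration only.

$$
p \approx \left(1 - e^{-kn/m}\right)^{k}, \qquad k_{\text{opt}} = \frac{m}{n}\ln 2 \;\Rightarrow\; p \approx 2^{-k_{\text{opt}}} \approx 0.6185^{\,m/n}
$$

For instance, under these assumptions $m/n = 10$ bits per element gives $k_{\text{opt}} \approx 7$ and a false positive probability of roughly $0.6185^{10} \approx 0.008$, that is, under 1%.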

Bifocal sampling for skew-resistant join size estimation

- Mathematics
- SIGMOD '96
- 1996

The estimate obtained by the bifocal sampling algorithm is proven to lie with high probability within a small constant factor of the actual join size, regardless of the skew, as long as the join size is Ω(n lg n), for relations consisting of n tuples.

Computing Iceberg Queries Efficiently

- Computer Science
- VLDB
- 1998

This work proposes efficient algorithms to evaluate iceberg queries using very little memory and significantly fewer passes over data, as compared to current techniques that use sorting or hashing.

Designing a Bloom filter for differential file access

- Computer Science
- CACM
- 1982

The design process for a Bloom filter for an on-line student database is described, and it is shown that a very effective filter can be constructed with a modest expenditure of system resources.

Fixed-precision estimation of join selectivity

- Mathematics
- PODS '93
- 1993

A partial ordering that compares the variability of the estimators for the different procedures after an arbitrary fixed number of sampling steps leads to a new algorithm for fixed-precision estimation of the selectivity of an equijoin that appears to be the best available when there are no indices on the join key.

New sampling-based summary statistics for improving approximate query answers

- Computer Science
- SIGMOD '98
- 1998

This paper introduces two new sampling-based summary statistics, concise samples and counting samples, and presents new techniques for their fast incremental maintenance regardless of the data distribution, and considers their application to providing fast approximate answers to hot list queries.