Telescoping Filter: A Practical Adaptive Filter

@inproceedings{Lee2021TelescopingFA,
  title={Telescoping Filter: A Practical Adaptive Filter},
  author={David J. Lee and Samuel McCauley and Shikha Singh and Maximilian Stein},
  booktitle={ESA},
  year={2021}
}
Filters are small, fast, and approximate set membership data structures. They are often used to filter out expensive accesses to a remote set S for negative queries (that is, filtering out queries x ∉ S). Filters have one-sided errors: on a negative query, a filter may say "present" with a tunable false-positive probability of ε. Correctness is traded for space: filters only use log (1/ε) + O(1) bits per element. The false-positive guarantees of most filters, however, hold only for a single…
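
To make the one-sided error concrete, here is a minimal sketch of a classic Bloom filter in Python. It is a stand-in chosen for brevity, not the telescoping filter from this paper: a Bloom filter is not adaptive, and it uses roughly 1.44 log2(1/ε) bits per element rather than the log (1/ε) + O(1) bound quoted above. The class name `BloomFilter`, its methods, and the SHA-256 double-hashing scheme are all illustrative assumptions.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, capacity: int, eps: float):
        # Standard sizing (assumed here): m = n * log2(1/eps) / ln 2 bits,
        # k = log2(1/eps) hash functions, rounded to an integer.
        self.num_bits = max(1, math.ceil(capacity * math.log2(1 / eps) / math.log(2)))
        self.num_hashes = max(1, round(math.log2(1 / eps)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter(capacity=1000, eps=0.01)
    for i in range(1000):
        bf.add(f"key-{i}")
    # Every inserted key is reported present (no false negatives).
    print(all(bf.might_contain(f"key-{i}") for i in range(1000)))
    # Negative queries are occasionally reported present (false positives).
    false_positives = sum(bf.might_contain(f"other-{i}") for i in range(10000))
    print(false_positives / 10000)
```

Note that a non-adaptive filter like this one gives the same answer every time a given false positive is queried again, which is the weakness that adaptive filters are designed to address.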

Thesis for the degree Bet-or-Pass: Adversarially Robust Bloom Filters

TLDR
The goal is that a robust Bloom filter should behave like a random biased coin even for an adaptive adversary, and the notion of Bet-or-Pass as capturing the desired properties of such a data structure is highlighted.

Workload-Adaptive Filtering in Storage Engines

TLDR
This work presents an adaptive filter that remembers frequent false positives to turn them into true negatives for future queries and shows that this adaptive filter can provide up to 2x end-to-end throughput improvement.

Bet-or-Pass: Adversarially Robust Bloom Filters

A Bloom filter is a data structure that maintains a succinct and probabilistic representation of a set S ⊆ U of elements from a universe U. It supports approximate membership queries. The price of

References

Support Optimality and Adaptive Cuckoo Filters

TLDR
A new Adaptive Cuckoo Filter is designed, and it is shown to be support optimal over any n queries when storing a set of size n, and to be the first practical data structure that is support optimal, and the first support optimal filter that does not require additional space beyond a normal cuckoo filter.

Mitigating False Positives in Filters: to Adapt or to Cache?

TLDR
This work proves that an adaptive filter has a lower false-positive rate when the adversary is stochastic, and analyzes the broom filter against queries drawn from a Zipfian distribution to validate the analysis empirically.

Bloom Filters, Adaptivity, and the Dictionary Problem

TLDR
It is shown that adaptivity can be achieved effectively for free because it is possible to maintain an AMQ that uses the same amount of local space as a non-adaptive AMQ, performs all queries and updates in constant time, and guarantees that each negative query to the dictionary accesses remote storage with probability epsilon, independent of the results of past queries.

Spectral bloom filters

TLDR
The Spectral Bloom Filter is introduced, an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose multiplicities are below a threshold given at query time.

Ribbon filter: practically smaller than Bloom and Xor

TLDR
The Ribbon filter is introduced: a new filter for static sets with a broad range of configurable space overheads and false positive rates with competitive speed over that range, especially for larger f ≥ 2^{-7}.

Weighted Bloom filter

TLDR
The traditional Bloom filter is generalized to the weighted Bloom filter, which incorporates information on the query frequencies and the membership likelihood of the elements into its optimal design, and it is shown that the adapted Bloom filter always outperforms the traditional Bloom filter.

Bloom Filters in Adversarial Environments

TLDR
This work considers a data structure known as a “Bloom filter” and proves a tight connection between Bloom filters in this model and cryptography and shows that non-trivial (memory-wise) Bloom filters exist if and only if one-way functions exist.

A General-Purpose Counting Filter: Making Every Bit Count

TLDR
A new general-purpose AMQ, the counting quotient filter (CQF), is presented; it is small and fast, has good locality of reference, scales out of RAM to SSD, and supports deletions, counting, resizing, merging, and highly concurrent access.

Stacked Filters: Learning to Filter by Structure

TLDR
Stacked Filters is presented, a new probabilistic filter which is fast and robust similar to query-agnostic filters, and at the same time brings low false positive rates and sizes similar to classifier-based filters.

Morton Filters: Faster, Space-Efficient Cuckoo Filters via Biasing, Compression, and Decoupled Logical Sparsity

TLDR
This work introduces the Morton filter (MF), a novel AS-MDS that introduces several key improvements to CFs, and typically uses comparable to slightly less space than CFs for the same ε.