An Optimal Bloom Filter Replacement Based on Matrix Solving

@article{Porat2009AnOB,
  title={An Optimal Bloom Filter Replacement Based on Matrix Solving},
  author={Ely Porat},
  journal={ArXiv},
  year={2009},
  volume={abs/0804.1845}
}
  • E. Porat
  • Published 11 April 2008
  • Computer Science
  • ArXiv
We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters. The space requirements of the dictionary we suggest are much smaller than those of a hashtable. We allow storing n keys, each mapped to a value that is a string of k bits. Our suggested method requires nk + o(n) bits of space to store the dictionary and O(n) time to produce the data structure, and it allows answering a membership query in O(1) memory probes. The dictionary size…
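The general idea of a dictionary built by matrix solving can be illustrated with a small sketch. This is not the paper's construction: it assumes a simple scheme in which each key hashes to three table slots, the stored value is the XOR of those slots, and the table is found by plain Gaussian elimination over GF(2). The names `_positions`, `build`, `lookup`, and the `slack` parameter are hypothetical, the table takes roughly 1.3nk bits rather than nk + o(n), and construction is far from O(n).

```python
import hashlib


def _positions(key, m, seed, d=3):
    """Hash a key to d slots in a table of size m (hypothetical helper)."""
    out = []
    for i in range(d):
        h = hashlib.blake2b(f"{seed}:{i}:{key}".encode(), digest_size=8).digest()
        out.append(int.from_bytes(h, "big") % m)
    return out


def build(items, k, slack=1.3, max_tries=20):
    """Find a table such that XOR of a key's slots equals its k-bit value.

    Assumption: naive Gaussian elimination over GF(2); retries with a new
    hash seed if the linear system happens to be inconsistent.
    """
    if not all(0 <= v < (1 << k) for v in items.values()):
        raise ValueError("every value must fit in k bits")
    m = max(4, int(slack * len(items)) + 1)
    for seed in range(max_tries):
        pivots = {}  # pivot column -> (row bitmask, right-hand side)
        consistent = True
        for key, value in items.items():
            row, rhs = 0, value
            for p in _positions(key, m, seed):
                row ^= 1 << p  # repeated slots cancel, as they do in lookup
            while row:
                col = row.bit_length() - 1  # highest set bit is the pivot
                if col not in pivots:
                    pivots[col] = (row, rhs)
                    break
                prow, prhs = pivots[col]
                row ^= prow
                rhs ^= prhs
            else:
                # Row reduced to zero: fine only if the value also cancels.
                if rhs != 0:
                    consistent = False
                    break
        if not consistent:
            continue
        # Back substitution in increasing pivot order; free slots stay 0.
        table = [0] * m
        for col in sorted(pivots):
            row, rhs = pivots[col]
            rest = row ^ (1 << col)
            while rest:
                low = rest & -rest
                rhs ^= table[low.bit_length() - 1]
                rest ^= low
            table[col] = rhs
        return table, seed
    raise RuntimeError("no solvable system found; increase slack or max_tries")


def lookup(table, seed, key):
    """Return the stored value for a key by XORing its slots."""
    acc = 0
    for p in _positions(key, len(table), seed):
        acc ^= table[p]
    return acc
```

For a key that was never stored, `lookup` returns an arbitrary k-bit value. Storing a k-bit fingerprint of each key as its value and comparing it at query time is one common way to turn such a structure into a Bloom-filter-like membership test with a false positive rate of about 2^-k.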
Conjunctive Filter: Breaking the Entropy Barrier
TLDR
The objective is to break this entropy bound, construct more space-efficient data structures, and show that many problems, such as full-text search and database join queries, can be solved by using a conjunctive filter.
Bloom Filters, Adaptivity, and the Dictionary Problem
TLDR
It is shown that adaptivity can be achieved effectively for free because it is possible to maintain an AMQ that uses the same amount of local space as a non-adaptive AMQ, performs all queries and updates in constant time, and guarantees that each negative query to the dictionary accesses remote storage with probability epsilon, independent of the results of past queries.
Succinct Data Structures for Retrieval and Approximate Membership
TLDR
It is shown that for any k, query time O(k) can be achieved using space that is within a factor 1 + e^{-k} of optimal, asymptotically for large n.
Random hypergraphs for hashing-based data structures
TLDR
This thesis examines how hyperedge distribution and load affect the probabilities with which these properties hold, derives corresponding thresholds, and identifies a hashing scheme that leads to a particularly high threshold value in this regard.
Experimental Variations of a Theoretically Good Retrieval Data Structure
TLDR
The practicability of one such theoretically very good proposal that has linear construction time, constant evaluation time and space consumption O(nr) bits is explored, bridging a gap between theory and real data structures.
Fully-Dynamic Space-Efficient Dictionaries and Filters with Constant Number of Memory Accesses
TLDR
This is the first space-efficient fully-dynamic dictionary that maintains both sets and random multisets and supports queries, insertions, and deletions with a constant number of memory accesses in the worst case with high probability.
Support Optimality and Adaptive Cuckoo Filters
TLDR
A new Adaptive Cuckoo Filter is designed, and it is shown to be support optimal over any n queries when storing a set of size n, and to be the first practical data structure that is support optimal, and the first support optimal filter that does not require additional space beyond a normal cuckoo filter.
A Space Lower Bound for Dynamic Approximate Membership Data Structures
An approximate membership data structure is a randomized data structure representing a set which supports membership queries. It allows for a small false positive error rate but has no false negative
Fast Succinct Retrieval and Approximate Membership using Ribbon
TLDR
Bumped ribbon retrieval (BuRR) is presented, the first practical succinct retrieval data structure, which achieves space overheads well below 1% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead ≥ 44%).
Monotone minimal perfect hashing: searching a sorted table with O(1) accesses
TLDR
This work considers the problem of monotone minimal perfect hashing, in which the bijection is required to preserve the lexicographical ordering of the keys, and shows how to compute the predecessor (in the sorted order of S) of an arbitrary element, using O(1) accesses in expectation and an index of O(n log w) bits.

References

SHOWING 1-10 OF 19 REFERENCES
Succinct Data Structures for Retrieval and Approximate Membership
TLDR
It is shown that for any k, query time O(k) can be achieved using space that is within a factor 1 + e^{-k} of optimal, asymptotically for large n.
Spectral bloom filters
TLDR
The Spectral Bloom Filter is introduced, an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose multiplicities are below a threshold given at query time.
Compressed bloom filters
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications
The Bloomier filter: an efficient data structure for static support lookup tables
TLDR
The Bloomier filter is introduced, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries and lower bounds are provided to prove the (near) optimality of the constructions.
Bloomier Filters: A second look
TLDR
This article gives a simple construction of a Bloomier filter, a space-efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of false positives.
An optimal Bloom filter replacement
TLDR
A new RAM data structure is considered for storing an approximation S′ to S such that S ⊆ S′ and any element not in S belongs to S′ with probability at most ε, and the space usage is within a lower order term of the lower bound.
An Algorithm for Approximate Membership checking with Application to Password Security
Space/time trade-offs in hash coding with allowable errors
TLDR
Analysis of the paradigm problem demonstrates that allowing a small number of test messages to be falsely identified as members of the given set will permit a much smaller hash area to be used without increasing reject time.
Network Applications of Bloom Filters: A Survey
TLDR
The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
OPUS: Preventing weak password choices