Bloomier Filters: A second look

@inproceedings{Charles2008BloomierFA,
  title={Bloomier Filters: A second look},
  author={Denis Xavier Charles and Kumar Chellapilla},
  booktitle={ESA},
  year={2008}
}
A Bloom filter is a space-efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of false positives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a simple construction of a Bloomier filter. The construction is linear in space and requires constant time to evaluate. The creation of our Bloomier filter takes linear time, which is faster than the existing…
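
To make "compactly store a function with a static support" concrete, here is a minimal Python sketch. It is not the construction from the paper, whose details are not given in this abstract; it uses the peeling-and-XOR idea familiar from xor filters. The class name `XorBloomier`, the `check_bits` parameter, the table-size factor of 1.23 with fixed slack, the retry limit, and the use of BLAKE2b as the hash are all assumptions made for illustration. Values are assumed to be non-negative integers.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


class XorBloomier:
    """Illustrative static map: exact on stored keys, rare false hits elsewhere."""

    def __init__(self, mapping: Dict[str, int], check_bits: int = 16,
                 max_tries: int = 50) -> None:
        self.check_bits = check_bits
        # Three blocks of cells, roughly 1.23 * n cells in total plus slack (assumed).
        self.block = int(1.23 * len(mapping) / 3) + 8
        keys = list(mapping)
        for seed in range(max_tries):
            self.seed = seed
            order = self._peel(keys)
            if order is not None:
                break
        else:
            raise RuntimeError("peeling failed for every seed tried")
        self.table = [0] * (3 * self.block)
        # Assign in reverse peeling order: each key's private cell is still 0
        # and is not read by any key that was assigned before it.
        for key, cell in reversed(order):
            a, b, c = self._cells(key)
            payload = (mapping[key] << check_bits) | self._check(key)
            self.table[cell] = payload ^ self.table[a] ^ self.table[b] ^ self.table[c]

    def _hash(self, key: str, i: int, modulus: int) -> int:
        data = f"{self.seed}:{i}:{key}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % modulus

    def _cells(self, key: str) -> Tuple[int, int, int]:
        # One cell per block, so the three cells are always distinct.
        return (self._hash(key, 0, self.block),
                self.block + self._hash(key, 1, self.block),
                2 * self.block + self._hash(key, 2, self.block))

    def _check(self, key: str) -> int:
        # Short fingerprint used to reject most keys outside the support.
        return self._hash(key, 3, 1 << self.check_bits)

    def _peel(self, keys: List[str]) -> Optional[List[Tuple[str, int]]]:
        size = 3 * self.block
        counts = [0] * size
        xor_idx = [0] * size  # XOR of the indices of keys touching each cell
        for idx, key in enumerate(keys):
            for cell in self._cells(key):
                counts[cell] += 1
                xor_idx[cell] ^= idx
        stack = [cell for cell in range(size) if counts[cell] == 1]
        order: List[Tuple[str, int]] = []
        while stack:
            cell = stack.pop()
            if counts[cell] != 1:
                continue  # stale entry: its only key was already peeled
            idx = xor_idx[cell]
            order.append((keys[idx], cell))
            for other in self._cells(keys[idx]):
                counts[other] -= 1
                xor_idx[other] ^= idx
                if counts[other] == 1:
                    stack.append(other)
        return order if len(order) == len(keys) else None

    def get(self, key: str) -> Optional[int]:
        a, b, c = self._cells(key)
        payload = self.table[a] ^ self.table[b] ^ self.table[c]
        if payload & ((1 << self.check_bits) - 1) != self._check(key):
            return None  # key is (almost certainly) outside the support
        return payload >> self.check_bits
```

In this sketch, `XorBloomier({"a": 1, "b": 2}).get("a")` recovers the stored value with three table reads, and a key outside the support is wrongly accepted with probability of about 2^-check_bits. Building the table takes time linear in the number of keys whenever peeling succeeds; a failed attempt is retried with a different seed.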

An Optimal Bloom Filter Replacement Based on Matrix Solving
TLDR
This work suggests a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters, and suggests a data structure that requires only nk bits of space, has O(n) preprocessing time, and has O(log n) query time.
Difference Bloom Filter: A probabilistic structure for multi-set membership query
TLDR
A novel probabilistic data structure named Difference Bloom Filter (DBF) for fast multi-set membership query, which not only is more accurate than the state-of-the-art, but has a faster query speed.
Bloom filter variants for multiple sets: a comparative assessment
TLDR
The comparison of two probabilistic data structures for association queries derived from the well-known Bloom filter shows that the ShBF provides better space efficiency, but at a significantly higher computational cost than the SBF.
Chucky: A Succinct Cuckoo Filter for LSM-Tree
TLDR
This work proposes Chucky, a new design that replaces the multiple Bloom filters by a single Cuckoo filter that maps each data entry to an auxiliary address of its location within the LSM-tree, and achieves the best of both worlds: a modest access cost and a low false positive rate at the same time.
Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters
TLDR
Xor filters can be faster than Bloom and cuckoo filters while using less memory and it is found that a more compact version of xor filters (xor+) can use even less space than highly compact alternatives (e.g., Golomb-compressed sequences) while providing speeds competitive with Bloom filters.
Magic Cube Bloom Filter: Answering Membership Queries for Multiple Sets
TLDR
A novel data structure, namely Magic Cube Bloom Filter (MCBF), which outperforms the state-of-the-art in terms of accuracy and query speed with a limited memory usage and improves the query speed by utilizing spatial locality.
Conjunctive Filter: Breaking the Entropy Barrier
TLDR
The objective is to break this entropy bound and construct more space-efficient data structures, and to show that many problems, such as full-text search and database join queries, can be solved by using a conjunctive filter.
Matrix Bloom Filter: An Efficient Probabilistic Data Structure for 2-tuple Batch Lookup
TLDR
Through both theoretical and empirical studies, the performance of matrix Bloom filter is superior on datasets with common statistical distributions; and even without them, it just degrades to a standard Bloom filter.
Optimizing Bloom Filter: Challenges, Solutions, and Comparisons
TLDR
This survey provides an overview of BF and its variants, with an emphasis on the optimization techniques, and conducts an in-depth study of the existing literature on BF optimization, covering more than 60 variants.
A Novel Scalable and Storage‐Efficient Architecture for High Speed Exact String Matching
TLDR
A novel architecture based upon a recently proposed data structure called the Bloomier filter is proposed, which can successfully support scalability and achieves better performance than other existing architectures, measured in throughput per logic cell per character.

References

SHOWING 1-10 OF 27 REFERENCES
The Bloomier filter: an efficient data structure for static support lookup tables
TLDR
The Bloomier filter is introduced, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries and lower bounds are provided to prove the (near) optimality of the constructions.
Compressed bloom filters
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications
Network Applications of Bloom Filters: A Survey
TLDR
The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
Spectral bloom filters
TLDR
The Spectral Bloom Filter is introduced, an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose multiplicities are below a threshold given at query time.
Designing a Bloom filter for differential file access
TLDR
The design process for a Bloom filter for an on-line student database is described, and it is shown that a very effective filter can be constructed with a modest expenditure of system resources.
Succinct Data Structures for Retrieval and Approximate Membership
TLDR
It is shown that for any k, query time O(k) can be achieved using space that is within a factor 1 + e^{-k} of optimal, asymptotically for large n.
PERF join: an alternative to two-way semijoin and bloomjoin
TLDR
This paper presents “Positionally Encoded Record Filters” (PERFs) and describes their use in a distributed query processing technique called PERF join and demonstrates through analytical studies that PERF join performs significantly better than two-way Bloomjoin and two-way semijoin variants under a wide range of relevant cost parameter values.
Graphs, Hypergraphs and Hashing
TLDR
An infinite family of efficient and practical algorithms for generating minimal perfect hash functions which allow an arbitrary order to be specified for the keys is presented, and it is shown that almost all members of the family are space and time optimal.
Summary cache: a scalable wide-area web cache sharing protocol
TLDR
This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Computing Iceberg Queries Efficiently
TLDR
This work proposes efficient algorithms to evaluate iceberg queries using very little memory and significantly fewer passes over data, as compared to current techniques that use sorting or hashing.