Succinct Data Structures for Retrieval and Approximate Membership
@inproceedings{Dietzfelbinger2008SuccinctDS, title={Succinct Data Structures for Retrieval and Approximate Membership}, author={Martin Dietzfelbinger and R. Pagh}, booktitle={ICALP}, year={2008} }
The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function $f\colon U\to \{0,1\}^r$ that has specified values on the elements of a given set S ⊆ U, |S| = n, but may have any value on elements outside S. All known methods (e.g. those based on perfect hash functions) induce a space overhead of Θ(n) bits over the optimum, regardless of the evaluation time. We show that for any k, query time O(k) can be achieved using space that is within…
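To make the problem statement concrete, here is a minimal Python sketch of a static retrieval structure. It is not the construction from the paper: it uses the familiar linear-algebra approach in which each key is hashed to k table cells and the table is chosen so that the XOR of those cells equals the key's r-bit value. The names `cell_indices`, `build`, and `query`, the SHA-256 based hashing, and the 1.3 cell-to-key ratio are all assumptions made for illustration.

```python
import hashlib


def cell_indices(key, seed, m, k):
    """Hypothetical hash family: k cell indices in [0, m) derived from SHA-256."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{seed}:{i}:{key}".encode()).digest()[:8], "big"
        ) % m
        for i in range(k)
    ]


def build(mapping, k=3, overhead=1.3, max_seeds=50):
    """Return (seed, table) such that query(key, seed, table) == mapping[key]."""
    m = int(overhead * len(mapping)) + k  # number of r-bit cells (assumed ratio)
    for seed in range(max_seeds):
        pivots = {}  # leading column -> (row bitmask, right-hand side)
        consistent = True
        for key, value in mapping.items():
            mask = 0
            for p in cell_indices(key, seed, m, k):
                mask ^= 1 << p  # a repeated index cancels, as it does in query()
            rhs = value
            # Gaussian elimination over GF(2): reduce by existing pivot rows.
            while mask:
                col = mask.bit_length() - 1
                if col not in pivots:
                    break
                pivot_mask, pivot_rhs = pivots[col]
                mask ^= pivot_mask
                rhs ^= pivot_rhs
            if mask:
                pivots[mask.bit_length() - 1] = (mask, rhs)
            elif rhs:
                consistent = False  # unsolvable for this seed, try the next one
                break
        if not consistent:
            continue
        # Back substitution in ascending pivot order; free cells stay 0.
        table = [0] * m
        for col in sorted(pivots):
            mask, rhs = pivots[col]
            rest = mask ^ (1 << col)
            while rest:
                low = rest & -rest
                rhs ^= table[low.bit_length() - 1]
                rest ^= low
            table[col] = rhs
        return seed, table
    raise RuntimeError("no solvable seed found; increase overhead or max_seeds")


def query(key, seed, table, k=3):
    """XOR the k cells selected by the key; arbitrary result for keys outside S."""
    out = 0
    for p in cell_indices(key, seed, len(table), k):
        out ^= table[p]
    return out
```

A query touches k cells and never stores the keys themselves, which is why keys outside S simply return some arbitrary r-bit value. With the assumed ratio the table holds about 30% more cells than there are keys, so this sketch is far from the succinct space bounds the paper studies.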
77 Citations
Fast Succinct Retrieval and Approximate Membership using Ribbon
- Computer Science
- ArXiv
- 2021
Bumped ribbon retrieval (BuRR) is presented, the first practical succinct retrieval data structure, which achieves space overheads well below 1% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead ≥ 44%).
Conjunctive Filter: Breaking the Entropy Barrier
- Computer Science
- ALENEX
- 2010
The objective is to break this entropy bound and construct more space-efficient data structures and show that many problems can be solved by using a conjunctive filter such as full-text search and database join queries.
A Space Lower Bound for Dynamic Approximate Membership Data Structures
- Computer Science
- SIAM J. Comput.
- 2013
An approximate membership data structure is a randomized data structure representing a set which supports membership queries. It allows for a small false positive error rate but has no false negative…
An Optimal Bloom Filter Replacement Based on Matrix Solving
- Computer Science
- CSR
- 2009
This work suggests a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters, and suggests a data structure that requires only nk bits of space, has O(n) preprocessing time, and has O(log n) query time.
A Lower Bound for Dynamic Approximate Membership Data Structures
- Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010
A new lower bound for the memory requirements of any dynamic approximate membership data structure is shown, implying that the entropy lower bound cannot be achieved by dynamic data structures for any constant error rate.
Constant-Time Retrieval with O(log m) Extra Bits
- Computer Science
- STACS
- 2019
This paper presents a method for treating the retrieval problem with overhead ε = O((log m)/m), which corresponds to O(1) extra memory words (O(log m) bits), and an extremely simple, constant-time query operation.
Random hypergraphs for hashing-based data structures
- Computer Science
- 2020
This thesis examines how hyperedge distribution and load affect the probabilities with which these properties hold, derives corresponding thresholds, and identifies a hashing scheme that leads to a particularly high threshold value in this regard.
Experimental Variations of a Theoretically Good Retrieval Data Structure
- Computer Science
- ESA
- 2009
The practicability of one such theoretically very good proposal that has linear construction time, constant evaluation time and space consumption O(nr) bits is explored, bridging a gap between theory and real data structures.
Tight Bounds for Sliding Bloom Filters
- Computer Science
- Algorithmica
- 2015
This work considers a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries of the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter.
How to Approximate a Set without Knowing Its Size in Advance
- Computer Science
- 2013 IEEE 54th Annual Symposium on Foundations of Computer Science
- 2013
A data structure is presented that uses (1+o(1)) n log(1/ε) + O(n log log n) bits of space for approximating any set of any size n, without having to know n in advance.
References
An Optimal Bloom Filter Replacement Based on Matrix Solving
- Computer Science
- CSR
- 2009
This work suggests a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters, and suggests a data structure that requires only nk bits of space, has O(n) preprocessing time, and has O(log n) query time.
Static Dictionaries Supporting Rank
- Computer Science
- ISAAC
- 1999
A static dictionary is a data structure for storing a subset S of a finite universe U so that membership queries can be answered efficiently and, when the queried element is found, its rank is returned.
Efficient Minimal Perfect Hashing in Nearly Minimal Space
- Computer Science, Mathematics
- STACS
- 2001
A simple randomized scheme that uses n log e + log log u + o(n + log log u) bits and has constant evaluation time and O(n + log log u) expected construction time is presented.
On dynamic range reporting in one dimension
- Computer Science, Mathematics
- STOC '05
- 2005
This work considers the problem of maintaining a dynamic set of integers and answering queries of the form: report a point (equivalently, all points) in a given interval and develops the first scheme for dynamic perfect hashing requiring sublinear space.
The Bloomier filter: an efficient data structure for static support lookup tables
- Computer Science
- SODA '04
- 2004
The Bloomier filter is introduced, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries and lower bounds are provided to prove the (near) optimality of the constructions.
Low Redundancy in Static Dictionaries with Constant Query Time
- Computer Science
- 2001
It is shown that on a unit cost RAM with word size Θ(log |U|), a static dictionary for n-element sets with constant worst case query time can be obtained using B + O(log log |U|) + o(n) bits of storage, where B is the minimum number of bits needed to represent all n-element subsets of U.
Space Efficient Hash Tables with Worst Case Constant Access Time
- Computer Science
- Theory of Computing Systems
- 2004
This is the first dictionary that has worst case constant access time and expected constant update time, works with (1 + ε)n space, and supports satellite information.
Balanced Allocation and Dictionaries with Tightly Packed Constant Size Bins
- Mathematics
- ICALP
- 2005
It is shown that $\varepsilon > (2/e)^{d-1}$ is sufficient to guarantee that with high probability each ball can be put into one of the two bins assigned to it, without any bin overflowing.
Compressed bloom filters
- Computer Science
- PODC '01
- 2001
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications…
Efficient hashing with lookups in two memory accesses
- Computer Science
- SODA '05
- 2005
This work presents a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization, and analyzes the trade-off between the number of moves performed during inserts and the maximum load on a bucket.