Universal classes of hash functions (Extended Abstract)

@inproceedings{Carter1977UniversalCO,
  title={Universal classes of hash functions (Extended Abstract)},
  author={Larry Carter and Mark N. Wegman},
  booktitle={STOC '77},
  year={1977}
}
This paper gives an input-independent, average linear-time algorithm for storage and retrieval on keys. The algorithm makes a random choice of hash function from a suitable class of hash functions. Given any sequence of inputs, the expected time (averaging over all functions in the class) to store and retrieve elements is linear in the length of the sequence. The number of references to the database required by the algorithm for any input is extremely close to the theoretical minimum for any…
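The abstract's central idea, choosing a hash function at random from a class and then storing and retrieving with it, can be illustrated with a minimal sketch. The abstract does not say which class is used, so the family h(x) = ((a*x + b) mod P) mod m below is an assumption: it is a commonly cited universal family for non-negative integer keys smaller than the prime P. The names `make_universal_hash` and `ChainedTable` are hypothetical and chosen only for this illustration.

```python
import random

# Assumption: all keys are non-negative integers smaller than this prime.
P = (1 << 61) - 1  # the Mersenne prime 2^61 - 1


def make_universal_hash(m, rng=random):
    """Draw h(x) = ((a*x + b) mod P) mod m at random from the family."""
    a = rng.randrange(1, P)  # a is nonzero
    b = rng.randrange(0, P)

    def h(x):
        return ((a * x + b) % P) % m

    return h


class ChainedTable:
    """Hash table with chaining whose hash function is picked at random."""

    def __init__(self, m=64, seed=None):
        rng = random.Random(seed)
        self.h = make_universal_hash(m, rng)
        self.buckets = [[] for _ in range(m)]

    def store(self, key, value):
        bucket = self.buckets[self.h(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:  # overwrite an existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def retrieve(self, key):
        for k, v in self.buckets[self.h(key)]:
            if k == key:
                return v
        return None  # key not present
```

Because the function is drawn after the keys are fixed, no single input sequence is bad for every choice of `a` and `b`; the expectation is taken over that random draw rather than over the inputs.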

Analysis of a Universal Class of Hash Functions

TLDR
This paper uses linear algebraic methods to analyze the performance of several classes of hash functions, including the class H2 presented by Carter and Wegman, and shows that the probability of choosing a function from H which maps x to the same value as more than t other elements of S is no greater than min.

Graphs, Hypergraphs and Hashing

TLDR
An infinite family of efficient and practical algorithms for generating minimal perfect hash functions which allow an arbitrary order to be specified for the keys is presented, and it is shown that almost all members of the family are space and time optimal.

New classes and applications of hash functions

  • M. Wegman, L. Carter
  • Computer Science, Mathematics
    20th Annual Symposium on Foundations of Computer Science (sfcs 1979)
  • 1979
TLDR
Several new classes of hash functions with certain desirable properties are exhibited, and two novel applications for hashing which make use of these functions are introduced, including a provably secure authentication technique for sending messages over insecure lines.

Finding minimal perfect hash functions

TLDR
The procedure is to construct a set of graph models for the dictionary, then choose one of the models for use in constructing the minimal perfect hashing function, which relies on a backtracking algorithm for numbering the vertices of the graph.

Entropy-Learned Hashing: Constant Time Hashing with Controllable Uniformity

TLDR
Entropy-Learned Hashing models and estimates the randomness of the input data, and then creates data-specific hash functions that use only the parts of the data needed to differentiate the outputs. This dramatically reduces the computational cost of hashing, and the output is proved to be similarly uniform to that of traditional hash functions.

The Power of Two Choices with Simple Tabulation

TLDR
This paper investigates the power of two choices when the hash functions h0 and h1 are implemented with simple tabulation, which is a very efficient hash function evaluated in constant time, and shows that the expected maximum load is lg lg n + O(1), just like with fully random hashing.
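As a rough illustration of the setup described here, the sketch below places each key in the less loaded of two bins chosen by two independent simple tabulation hash functions. The details are assumptions for illustration only: keys are taken to be 32-bit integers split into four 8-bit characters, hash values are reduced to a bin index with a modulo, and the names `make_simple_tabulation` and `two_choice_loads` are hypothetical.

```python
import random


def make_simple_tabulation(rng, chars=4, bits=32):
    """Build a simple tabulation hash for keys made of `chars` 8-bit characters."""
    # One table of 256 random values per character position of the key.
    tables = [[rng.getrandbits(bits) for _ in range(256)] for _ in range(chars)]

    def h(x):
        out = 0
        for i in range(chars):
            # Look up the i-th byte of the key and XOR the table entries together.
            out ^= tables[i][(x >> (8 * i)) & 0xFF]
        return out

    return h


def two_choice_loads(keys, n_bins, seed=0):
    """Place each key in the less loaded of its two candidate bins."""
    rng = random.Random(seed)
    h0 = make_simple_tabulation(rng)
    h1 = make_simple_tabulation(rng)
    loads = [0] * n_bins
    for x in keys:
        i, j = h0(x) % n_bins, h1(x) % n_bins
        target = i if loads[i] <= loads[j] else j  # ties go to the h0 choice
        loads[target] += 1
    return loads
```

Calling `max(two_choice_loads(range(n), n))` for growing `n` gives an informal feel for how slowly the maximum load grows; it is not a substitute for the paper's analysis.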

Bloom maps for big data

TLDR
A lower bound on the space required per key is given in terms of the entropy of the distribution over values and the error rate and a generalization of the Bloom filter, the Bloom map, is presented that achieves the lower bound up to a small constant factor.

From Independence to Expansion and Back Again

TLDR
While the previously most efficient construction needed time quasipolynomial in Siegel's lower bound, this paper's time bound is just a logarithmic factor from the lower bound.

On the Cryptographic Applications of Random Functions

Now that "random functions" can be efficiently constructed ([GGM]), we discuss some of their possible applications to cryptography: 1) Distributing unforgeable ID numbers which can be locally verified

Fast and powerful hashing using tabulation

TLDR
Recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees are surveyed, including twisted tabulation, which yields an extremely fast pseudorandom number generator that is provably good for many classic randomized algorithms and data-structures.

References


Hashing LEMMAs on time complexities with applications to formula manipulation

In this paper, time complexities of operation on “sets” and “ordered n-tuples” based on a hashing table search technique are presented as “Hashing LEMMAs” and are applied to formula manipulation.

Extendible hashing—a fast access method for dynamic files

TLDR
This work studies, by analysis and simulation, the performance of extendible hashing and indicates that it provides an attractive alternative to other access methods, such as balanced trees.

Arithmetic complexity of unordered sparse polynomials

TLDR
It is shown that addition and subtraction of unordered sparse polynomials can be done in time m+n and multiplication in time m+n, showing that the visual demand of human users can be satisfied at the cost of a single sort of the final answer as opposed to sorting each intermediate result in the case of ordered representations.

The Design and Analysis of Computer Algorithms

TLDR
This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.

A Fast Monte-Carlo Test for Primality

TLDR
A uniform distribution a from a uniform distribution on the set 1, 2, 3, 4, 5 is a random number and if a and n are relatively prime, compute the residue varepsilon.

Computational complexity of probabilistic Turing machines

TLDR
It is shown how probabilistic linear-bounded automata can simulate nondeterministic linear-bounded automata, and an example is given of a function computable more quickly by probabilistic Turing machines than by deterministic Turing machines.

Sorting and Searching

The first revision of this third volume is a survey of classical computer techniques for sorting and searching. It extends the treatment of data structures in Volume 1 to consider both large and

Multifont OCR Postprocessing System

TLDR
The result was the development of a software simulator which processed sequential fields generated by the Advanced Optical Character Reader, performed the four functions indicated above, and selected the correct alphabetic word from a dictionary of 62000 entries.

IBM journal of research and development: information for authors

TLDR
Background information about the IBM Journal of Research and Development is combined with guidelines for the preparation of Journal manuscripts to acquaint authors with the Journal as a primary, professional publication and to present suggestions to ease the work of author and editor in preparing clear, concise, and useful manuscripts.

Arithmetic Complexity of Unordered or Sparse Polynomials

  • Proceedings of the 1976 ACM Symposium on Symbolic and Algebraic Computation
  • 1976