Selecting a hashing algorithm

Bruce J. McKenzie, R. Harries, Tim Bell
Software: Practice and Experience

Hashing is so commonly used in computing that one might expect hash functions to be well understood, and that choosing a suitable function should not be difficult. The results of investigations into the performance of some widely used hashing algorithms are presented and it is shown that some of these algorithms are far from optimal. Recommendations are made for choosing a hashing algorithm and measuring its performance. 
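
As a rough illustration of what "measuring its performance" can mean for a hash function used with chaining, the sketch below compares two toy functions on the same key set. It is not the procedure from the paper: the two hash functions, the `var0` ... `var999` key set, the table size of 1009, and the helper names `bucket_stats` and `expected_used_uniform` are all assumptions chosen for this example.

```python
from collections import Counter


def additive_hash(key: str) -> int:
    """Deliberately weak example: sum of character codes."""
    return sum(ord(ch) for ch in key)


def polynomial_hash(key: str, multiplier: int = 31) -> int:
    """Illustrative polynomial hash, kept to 32 bits."""
    h = 0
    for ch in key:
        h = (h * multiplier + ord(ch)) & 0xFFFFFFFF
    return h


def bucket_stats(hash_fn, keys, num_buckets):
    """Summarize how hash_fn spreads distinct keys over chained buckets."""
    counts = Counter(hash_fn(k) % num_buckets for k in keys)
    n = len(keys)
    # A successful search in a chain of length c costs 1..c probes,
    # so that chain contributes c * (c + 1) / 2 probes in total.
    total_probes = sum(c * (c + 1) // 2 for c in counts.values())
    return {
        "used_buckets": len(counts),
        "longest_chain": max(counts.values()),
        "avg_probes": total_probes / n,
    }


def expected_used_uniform(n, num_buckets):
    """Expected non-empty buckets if keys were assigned uniformly at random."""
    return num_buckets * (1 - (1 - 1 / num_buckets) ** n)


if __name__ == "__main__":
    keys = [f"var{i}" for i in range(1000)]  # hypothetical identifier-like keys
    m = 1009  # a prime table size, chosen arbitrarily for this sketch
    for fn in (additive_hash, polynomial_hash):
        print(fn.__name__, bucket_stats(fn, keys, m))
    print("uniform baseline, used buckets:",
          round(expected_used_uniform(len(keys), m)))
```

Under these assumptions, a function whose used-bucket count sits close to the uniform baseline, with a low average probe count, is spreading keys about as well as a random assignment would. One that falls far short, as the additive function does on these keys, is clustering them into a few long chains.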

Performance in Practice of String Hashing Functions
This paper describes a class of string hashing functions and shows that this class of hashing functions is reliable and efficient, and is therefore an appropriate choice for general-purpose hashing.
On the Distribution of Keys by Hashing
The distribution of keys by a hash function as used in hash search with chaining is studied by considering the distribution of keys a random function from keys to buckets would give. This model
Two Effective Functions on Hashing URL
The finding is that the well-known function for hashing sequences of symbols, ELFhash, is not very good in this regard, and that the other two functions are better and thus recommended.
Recursive hashing functions for n-grams
This article generalizes recursive hash functions found in the literature and proposes new methods offering superior performance, demonstrating substantial speed improvement over conventional approaches, while retaining near-ideal hash value distribution.
Main-Memory Linear Hashing - Some Enhancements of Larson's Algorithm
Several modifications of the basic scheme of Linear Hashing are presented, together with performance measurements, and it is shown that the seemingly popular hashpjw hashing function presented in "the dragon book" can perform fairly poorly if the so-called randomizing mod operation is omitted.
Key and Value Paired Data using Java Hash Table
This paper proposes using a Java HashTable mechanism for an online shopping cart web application, employing new, fast hash functions to implement hash tables and to avoid collisions.
Simplistic Hashing for Building a Better Bloom Filter on Randomized Data
This paper proposes an improvement to the randomized redundant data filtering by the support of an efficient hashing algorithm and evaluates these hashing algorithms for building a Bloom filter on randomized data.
Efficient In-memory Data Structures for n-grams Indexing
This paper describes several data structures, such as hash tables and B+ trees, that could store n-grams for searching, and performs tests that show their advantages and disadvantages.
Performance of the most common non‐cryptographic hash functions
Evaluates the most widely used non-cryptographic hash functions (NCHFs) against four criteria: collision resistance, distribution of outputs, avalanche effect, and speed. The aim is to help practitioners and engineers make more informed choices about which function to use for a particular problem.


A practical tool kit for making portable compilers
The Amsterdam Compiler Kit is an integrated collection of programs designed to simplify the task of producing portable (cross) compilers and interpreters. For each language to be compiled, a program
The Art of Computer Programming
The Art of Computer Programming, Vol. 3: Sorting and Searching