Fast Locality-Sensitive Hashing for Approximate Near Neighbor Search

Abstract

The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a general technique for constructing a data structure to answer approximate near neighbor queries by using a distribution H over locality-sensitive hash functions that partition space. For a collection of n points, after preprocessing, the query time is dominated by O(n^ρ log n) evaluations of hash functions from H and O(n^ρ) hash table lookups and distance computations, where ρ ∈ (0, 1) is determined by the locality-sensitivity properties of H. It follows from a recent result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive hash functions can be reduced to O(log^2 n), leaving the query time to be dominated by O(n^ρ) distance computations and O(n^ρ log n) additional word-RAM operations. We state this result as a general framework and provide a simpler analysis showing that the number of lookups and distance computations closely matches that of the Indyk-Motwani framework, making it a viable replacement in practice. Using ideas from another locality-sensitive hashing framework by Andoni and Indyk (SODA 2006), we are able to reduce the number of additional word-RAM operations to O(n^ρ).

1998 ACM Subject Classification: E.1 Data Structures, H.3.3 Information Search and Retrieval
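
To make the cost accounting in the abstract concrete, the following is a minimal sketch of the classic Indyk-Motwani scheme, not the faster construction the paper proposes. It assumes binary vectors under Hamming distance with bit sampling as the locality-sensitive family. The class name `BitSamplingLSH` and the parameters `k`, `L`, and `max_dist` are illustrative choices that do not come from the paper. A query evaluates k · L hash functions, one concatenated key per table. With the standard choice of k = O(log n) and L = O(n^ρ), this gives the O(n^ρ log n) hash evaluations the abstract attributes to the original framework.

```python
import random
from collections import defaultdict


def hamming(x, y):
    """Number of coordinates in which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


class BitSamplingLSH:
    """Illustrative Indyk-Motwani style index for binary vectors.

    Assumption: bit sampling is used as the locality-sensitive family.
    Each of the L tables is keyed by k sampled coordinates, so a query
    costs k * L hash evaluations plus L table lookups.
    """

    def __init__(self, dim, k, L, seed=0):
        rng = random.Random(seed)
        # One tuple of k sampled coordinate indices per table.
        self.projections = [
            tuple(rng.randrange(dim) for _ in range(k)) for _ in range(L)
        ]
        self.tables = [defaultdict(list) for _ in range(L)]
        self.points = []

    @staticmethod
    def _key(coords, x):
        # Concatenation of k single-bit hash values.
        return tuple(x[i] for i in coords)

    def insert(self, x):
        idx = len(self.points)
        self.points.append(x)
        for coords, table in zip(self.projections, self.tables):
            table[self._key(coords, x)].append(idx)

    def query(self, q, max_dist):
        """Return the index of some stored point within max_dist of q, or None."""
        for coords, table in zip(self.projections, self.tables):
            for idx in table.get(self._key(coords, q), ()):
                # Distance computation on each colliding candidate.
                if hamming(self.points[idx], q) <= max_dist:
                    return idx
        return None
```

A stored point always collides with itself in every table, so querying with an exact copy is guaranteed to find it. For nearby but non-identical queries, success is only probabilistic and depends on `k` and `L`.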

Cite this paper

@article{Christiani2017FastLH, title={Fast Locality-Sensitive Hashing for Approximate Near Neighbor Search}, author={Tobias Christiani}, journal={CoRR}, year={2017}, volume={abs/1708.07586} }