An Investigation of Practical Approximate Nearest Neighbor Algorithms

Abstract

This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high-dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor search, such as metric trees, be altered to provide approximate answers to proximity queries and, if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same random-projection-based approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high-dimensional datasets, which shows up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels.
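To make the idea of an overlapping metric tree concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify how pivots are chosen, how wide the overlap is, or how the search proceeds. The sketch assumes a splitting hyperplane halfway between two far-apart pivots, a hypothetical overlap half-width `tau` (points within `tau` of the plane are stored in both children), and a simple 1-NN descent without backtracking. The names `build`, `query`, `tau`, and `leaf_size` are all invented for this example.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def signed_offset(p, mid, unit):
    """Signed distance of p from the hyperplane through mid with normal unit."""
    return sum((x - m) * u for x, m, u in zip(p, mid, unit))


def build(points, tau, leaf_size=8, rng=None):
    """Build a toy overlapping tree (assumes points is non-empty and tau >= 0)."""
    rng = rng or random.Random(0)
    if len(points) <= leaf_size:
        return {"leaf": points}
    # Cheap heuristic for two far-apart pivots (an assumption of this sketch).
    seed = rng.choice(points)
    p1 = max(points, key=lambda p: dist2(p, seed))
    p2 = max(points, key=lambda p: dist2(p, p1))
    length = dist2(p1, p2) ** 0.5
    if length == 0:
        return {"leaf": points}  # all points coincide
    unit = [(b - a) / length for a, b in zip(p1, p2)]
    mid = [(a + b) / 2 for a, b in zip(p1, p2)]
    # Overlap: points within tau of the plane are copied into both children.
    left = [p for p in points if signed_offset(p, mid, unit) <= tau]
    right = [p for p in points if signed_offset(p, mid, unit) >= -tau]
    # If the overlap swallows the whole node, stop splitting so that
    # every recursive call works on a strictly smaller point set.
    if len(left) == len(points) or len(right) == len(points):
        return {"leaf": points}
    return {
        "plane": (mid, unit),
        "left": build(left, tau, leaf_size, rng),
        "right": build(right, tau, leaf_size, rng),
    }


def query(node, q):
    """Approximate 1-NN: descend to a single leaf, never backtrack."""
    while "leaf" not in node:
        mid, unit = node["plane"]
        node = node["left"] if signed_offset(q, mid, unit) <= 0 else node["right"]
    return min(node["leaf"], key=lambda p: dist2(p, q))
```

The overlap is what makes the no-backtracking descent plausible in this sketch: a query that falls just on one side of a split still sees the near neighbors that lie just on the other side, because those points were stored in both children. A larger `tau` trades memory and build time for a lower chance of missing the true nearest neighbor.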

Cite this paper

@inproceedings{Liu2004AnIO,
  title={An Investigation of Practical Approximate Nearest Neighbor Algorithms},
  author={Ting Liu and Andrew W. Moore and Alexander G. Gray and Ke Yang},
  booktitle={NIPS},
  year={2004}
}