Shortlist Selection with Residual-Aware Distance Estimator for K-Nearest Neighbor Search

Abstract

In this paper, we introduce a novel shortlist computation algorithm for approximate, high-dimensional nearest neighbor search. Our method relies on a new distance estimator, the residual-aware distance estimator, which accounts for the residual distances of data points to their respective quantized centroids, and uses it for accurate shortlist computation. Furthermore, we perform the residual-aware distance estimation with little additional memory and computational cost through simple pre-computation methods for inverted index and multi-index schemes. Because it modifies the initial shortlist collection phase, our algorithm is applicable to most inverted indexing methods that use vector quantization. We have tested the proposed method with the inverted index and multi-index on a diverse set of benchmarks, including up to one billion data points with varying dimensions, and found that our method robustly improves the accuracy of shortlists (a relative improvement of up to 127%) over state-of-the-art techniques at a comparable or even lower computational cost.
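
The abstract does not state the estimator's formula, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a point's residual is roughly orthogonal to the query-to-centroid direction, so that the squared query-to-point distance can be approximated by the squared query-to-centroid distance plus the point's squared residual. The names `build_index` and `shortlist`, and the choice to store one precomputed squared residual per point, are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(data: List[Vector], centroids: List[Vector]) -> List[List[Tuple[int, float]]]:
    """Assign each point to its nearest centroid.

    Each inverted list stores (point_id, squared_residual), where the
    squared residual is precomputed once at indexing time (an assumption
    of this sketch, standing in for the paper's pre-computation step).
    """
    lists: List[List[Tuple[int, float]]] = [[] for _ in centroids]
    for pid, x in enumerate(data):
        cid = min(range(len(centroids)), key=lambda c: sq_dist(x, centroids[c]))
        lists[cid].append((pid, sq_dist(x, centroids[cid])))
    return lists


def shortlist(query: Vector, centroids: List[Vector],
              lists: List[List[Tuple[int, float]]], size: int) -> List[int]:
    """Rank points by the assumed estimate ||q - c||^2 + ||x - c||^2."""
    scored = []
    for cid, c in enumerate(centroids):
        qc = sq_dist(query, c)  # computed once per cell, not per point
        for pid, r2 in lists[cid]:
            scored.append((qc + r2, pid))
    scored.sort()
    return [pid for _, pid in scored[:size]]


# Toy data: two cells; point 1 lies far from its centroid.
centroids = [(0.0, 0.0), (10.0, 0.0)]
data = [(0.0, 1.0), (0.0, 5.0), (10.0, 0.5)]
index = build_index(data, centroids)
result = shortlist((4.0, 0.0), centroids, index, size=2)
print(result)  # [0, 2]
```

In this toy case, ranking by query-to-centroid distance alone would fill the shortlist from the nearest cell (points 0 and 1), whereas the residual-aware estimate prefers point 2 over the far-from-centroid point 1, matching the true two nearest neighbors. The residuals here are exactly orthogonal to the query-to-centroid direction, so the estimate is exact; on real data it would only be an approximation.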

DOI: 10.1109/CVPR.2016.221

Cite this paper

@article{Heo2016ShortlistSW,
  title={Shortlist Selection with Residual-Aware Distance Estimator for K-Nearest Neighbor Search},
  author={Jae-Pil Heo and Zhe L. Lin and Xiaohui Shen and Jonathan Brandt and Sung-Eui Yoon},
  journal={2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2016},
  pages={2009-2017}
}