ST-TCSC-2000-03 Expected-Case Complexity of Approximate Nearest Neighbor Searching

Abstract

Most research in algorithms for geometric query problems has focused on their worst-case performance. But when information on the query distribution is available, the alternative paradigm of designing and analyzing algorithms from the perspective of expected-case performance appears more attractive. We study the approximate nearest neighbor problem from this point of view. As a first step in this direction, we assume that the query points are chosen uniformly from a hypercube that encloses all the data points; however, we make no assumption on the distribution of data points. We investigate three simple variants of partition trees: sliding-midpoint, balance-split, and hybrid-split trees. We show that with these simple tree-based data structures, it is possible to achieve linear space and logarithmic or polylogarithmic query time in the expected case. In contrast, the data structures known to achieve linear space and logarithmic query time in the worst case are complex, and algorithms on them run more slowly in practice. Moreover, for the sliding-midpoint tree, we prove that it achieves optimal expected query time under reasonable assumptions.
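As a rough illustration of the first of these structures, the sketch below builds a sliding-midpoint tree and answers a (1+eps)-approximate nearest neighbor query on it. It assumes the commonly described form of the sliding-midpoint rule: cut the cell at the midpoint of its longest side, and if all points fall on one side, slide the cut toward the points until it touches the first one. The names `build` and `ann_query`, the dictionary node layout, the `leaf_size` parameter, and the plain depth-first search are assumptions made for illustration; they are not taken from the paper, and the search is not necessarily the procedure the paper analyzes.

```python
import random


def build(points, lo, hi, leaf_size=1):
    """Build a sliding-midpoint tree over `points` inside the cell [lo, hi].

    `points`, `lo`, and `hi` are tuples of floats (hypothetical layout).
    """
    if len(points) <= leaf_size:
        return {"leaf": True, "points": points}
    # Split orthogonally to the longest side of the current cell.
    dim = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    cut = (lo[dim] + hi[dim]) / 2.0
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    if not left:
        # All points lie above the midpoint: slide the cut up to the
        # first point and give that point its own side.
        right.sort(key=lambda p: p[dim])
        cut = right[0][dim]
        left, right = right[:1], right[1:]
    elif not right:
        # All points lie below the midpoint: slide the cut down instead.
        left.sort(key=lambda p: p[dim])
        cut = left[-1][dim]
        left, right = left[:-1], left[-1:]
    left_hi = hi[:dim] + (cut,) + hi[dim + 1:]
    right_lo = lo[:dim] + (cut,) + lo[dim + 1:]
    return {
        "leaf": False, "dim": dim, "cut": cut,
        "left": build(left, lo, left_hi, leaf_size),
        "right": build(right, right_lo, hi, leaf_size),
    }


def ann_query(node, q, eps=0.0, best=None):
    """Return (squared_distance, point) for a (1+eps)-approximate
    nearest neighbor of q, using a simple depth-first search."""
    if best is None:
        best = (float("inf"), None)
    if node["leaf"]:
        for p in node["points"]:
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            if d2 < best[0]:
                best = (d2, p)
        return best
    diff = q[node["dim"]] - node["cut"]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = ann_query(near, q, eps, best)
    # Every point on the far side is at least |diff| away, so the far
    # side is skipped unless it could beat best_distance / (1 + eps).
    if diff * diff * (1.0 + eps) ** 2 < best[0]:
        best = ann_query(far, q, eps, best)
    return best


# Usage: data points are arbitrary, the query is drawn uniformly from
# the enclosing hypercube (here the unit square).
random.seed(1)
data = [(random.random(), random.random()) for _ in range(100)]
tree = build(data, (0.0, 0.0), (1.0, 1.0))
query = (random.random(), random.random())
print(ann_query(tree, query, eps=0.5))
```

With `eps=0.0` the search returns an exact nearest neighbor; larger values allow more subtrees to be skipped at the cost of a distance up to a factor of (1+eps) larger. Because both children always receive at least one point, construction terminates, although the tree depth can be linear in the number of points for unfavorable inputs.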

Cite this paper

@inproceedings{Aryay2000STT, title={ST-TCSC-2000-03 Expected-Case Complexity of Approximate Nearest Neighbor Searching}, author={Sunil Arya and H. D. Addy}, year={2000} }