# An optimal algorithm for approximate nearest neighbor searching in fixed dimensions

@article{Arya1998AnOA, title={An optimal algorithm for approximate nearest neighbor searching in fixed dimensions}, author={Sunil Arya and David M. Mount and Nathan S. Netanyahu and Ruth Silverman and Angela Y. Wu}, journal={J. ACM}, year={1998}, volume={45}, pages={891-923} }

Consider a set *S* of *n* data points in real *d*-dimensional space, $\mathbb{R}^d$, where distances are measured using any Minkowski metric. In nearest neighbor searching, we preprocess *S* into a data structure, so that given any query point $q \in \mathbb{R}^d$, the closest point of *S* to *q* can be reported quickly. Given any positive…
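
As a point of reference for the problem stated in the abstract, the following is a minimal sketch of exact nearest neighbor search by linear scan under a Minkowski ($L_p$) metric. It is a brute-force baseline with no preprocessing, not the data structure of the paper; the function names `minkowski_distance` and `nearest_neighbor` and the sample points are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Sequence[float]


def minkowski_distance(a: Point, b: Point, p: float = 2.0) -> float:
    """L_p distance between a and b; p = math.inf gives the max (Chebyshev) metric."""
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    if p < 1:
        raise ValueError("p must be >= 1 for a Minkowski metric")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs, default=0.0)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def nearest_neighbor(points: Sequence[Point], q: Point, p: float = 2.0) -> Tuple[int, float]:
    """Return (index, distance) of the point closest to q, by scanning every point."""
    if not points:
        raise ValueError("the point set must be non-empty")
    best = min(range(len(points)), key=lambda i: minkowski_distance(points[i], q, p))
    return best, minkowski_distance(points[best], q, p)


# Hypothetical sample data: three points in the plane and one query.
S = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
q = (2.0, 2.0)
idx, dist = nearest_neighbor(S, q, p=2.0)
print(idx, dist)
```

Each query in this sketch costs time proportional to *n* times *d*, which is the cost that preprocessing *S* into a data structure is meant to avoid.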

#### 2,739 Citations

Distance browsing in spatial databases

- Computer Science
- TODS
- 1999

The incremental nearest neighbor algorithm significantly outperforms the existing k-nearest neighbor algorithm for distance browsing queries in a spatial database that uses the R-tree as a spatial index. It is also proved informally that, at any step in its execution, the incremental nearest neighbor algorithm is optimal with respect to the spatial data structure that is employed.

Approximate Nearest Neighbor under edit distance via product metrics

- Mathematics, Computer Science
- SODA '04
- 2004

To the authors' knowledge, this is the first data structure for this problem with both query time and storage subexponential in *d*; the space requirement of this data structure is roughly *O*, i.e., strongly subexponential.

Space-time tradeoffs for approximate nearest neighbor searching

- Mathematics, Computer Science
- JACM
- 2009

A single approach to nearest neighbor searching is presented, which both improves upon existing results and spans the spectrum of space-time tradeoffs. New algorithms for constructing AVDs and tools for analyzing their total space requirements are also provided.

Aggregate nearest neighbor queries in spatial databases

- Computer Science
- TODS
- 2005

Assuming that *Q* fits in memory and *P* is indexed by an R-tree, algorithms for aggregate nearest neighbor queries are developed that capture several versions of the problem, including weighted queries and incremental reporting of results.

Efficient algorithms for substring near neighbor problem

- Mathematics, Computer Science
- SODA '06
- 2006

The problem of finding the approximate nearest neighbor when the data set points are the substrings of a given text *T* is considered, and a data structure for this problem is presented.

Space-time tradeoffs for approximate spherical range counting

- Mathematics, Computer Science
- SODA '05
- 2005

This work presents space-time tradeoffs for approximate spherical range counting queries, broadly based on methods developed for approximate Voronoi diagrams, but it involves a number of significant extensions from the context of nearest neighbor searching to range searching.

Linear-size approximate Voronoi diagrams

- Mathematics, Computer Science
- SODA '02
- 2002

It is shown that, for a real parameter $2 \le \gamma \le 1/\varepsilon$, it is possible to construct an AVD consisting of $O(n/\varepsilon^{d})$ cells for $t = 1$, and that the cells in these AVDs are cubes or differences of two cubes.

Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform

- Mathematics, Computer Science
- STOC '06
- 2006

A new low-distortion embedding of $\ell_2^d$ into $\ell_p$ ($p = 1, 2$) is introduced, called the Fast Johnson-Lindenstrauss Transform (FJLT), based upon the preconditioning of a sparse projection matrix with a randomized Fourier transform.

Entropy based nearest neighbor search in high dimensions

- Mathematics, Computer Science
- SODA '06
- 2006

The problem of finding the approximate nearest neighbor of a query point in high-dimensional space is studied, focusing on Euclidean space, and it is shown that the *c* nearest neighbor can be computed in time and near-linear space, where $\rho \approx 2.06/c$ as *c* becomes large.

Approximate Nearest Neighbor Search Amid Higher-Dimensional Flats

- Mathematics, Computer Science
- ESA
- 2017

This work considers the approximate nearest neighbor (ANN) problem where the input set consists of *n* *k*-flats in the Euclidean $\mathbb{R}^d$, for any fixed parameters $0 \le k < d$, and presents an algorithm that achieves this task with $n^{k+1}(\log(n)/\varepsilon)^{O(1)}$ storage and preprocessing, and can answer a query in $O(\mathrm{polylog}(n))$ time.

#### References

An algorithm for approximate closest-point queries

- Computer Science, Mathematics
- SCG '94
- 1994

An algorithm is given for approximately solving the post office problem: given *n* points in *d* dimensions, build a data structure so that, given a query point, a closest site to the query point can be found quickly.

A general approach to d-dimensional geometric queries

- Mathematics, Computer Science
- STOC '85
- 1985

It is shown that any bounded region in $E^d$ can be divided into $2^d$ subregions of equal volume in such a way that no hyperplane in $E^d$ can intersect all $2^d$ of the subregions.

Linear time algorithms for visibility and shortest path problems inside simple polygons

- Computer Science, Mathematics
- SCG '86
- 1986

We present linear time algorithms for solving the following problems involving a simple planar polygon *P*: (i) Computing the collection of all shortest paths inside *P*…

Optimal algorithms for approximate clustering

- Mathematics, Computer Science
- STOC '88
- 1988

This work gives a polynomial time approximation scheme that estimates the optimal number of clusters under the second measure of cluster size within factors arbitrarily close to 1 for a fixed cluster size.

Approximate nearest neighbor queries in fixed dimensions

- Mathematics, Computer Science
- SODA '93
- 1993

A practical variant of this algorithm is implemented, and it is shown empirically that for many point distributions this variant of the algorithm finds the nearest neighbor in moderately large dimension significantly faster than existing practical approaches.

A Randomized Algorithm for Closest-Point Queries

- Mathematics, Computer Science
- SIAM J. Comput.
- 1988

This result approaches the $\Omega (n^{\lceil {{d / 2}} \rceil } )$ worst-case time required for any algorithm that constructs the Voronoi...

Approximate nearest neighbor queries revisited

- Computer Science, Mathematics
- SCG '97
- 1997

New methods to answer approximate nearest neighbor queries on a set of *n* points in *d*-dimensional Euclidean space are proposed and applications to various proximity problems are discussed.

Accounting for boundary effects in nearest neighbor searching

- Computer Science, Mathematics
- SCG '95
- 1995

An accurate analysis of the number of cells visited in nearest-neighbor searching by the bucketing and k-d tree algorithms is provided, and empirical evidence is presented showing that the analysis applies even in low dimensions.

Two algorithms for nearest-neighbor search in high dimensions

- Mathematics, Computer Science
- STOC '97
- 1997

A new approach to the nearest-neighbor problem is developed, based on a method for combining randomly chosen one-dimensional projections of the underlying point set, which results in an algorithm for finding ε-approximate nearest neighbors with a query time of O((d log d)(d + log n)).

A cost model for nearest neighbor search in high-dimensional data space

- Mathematics, Computer Science
- PODS '97
- 1997

A new cost model for nearest neighbor search in high-dimensional data space is developed. It takes boundary effects into account, so it also works in high dimensions, and it is applicable to different data distributions and index structures.