An optimal algorithm for approximate nearest neighbor searching in fixed dimensions

@article{Arya1998AnOA,
  title={An optimal algorithm for approximate nearest neighbor searching in fixed dimensions},
  author={Sunil Arya and David M. Mount and Nathan S. Netanyahu and Ruth Silverman and Angela Y. Wu},
  journal={J. ACM},
  year={1998},
  volume={45},
  pages={891-923}
}
Consider a set S of n data points in real d-dimensional space, $\mathbb{R}^d$, where distances are measured using any Minkowski metric. In nearest neighbor searching, we preprocess S into a data structure, so that given any query point $q \in \mathbb{R}^d$, the closest point of S to q can be reported quickly. Given any positive…
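To make the problem statement above concrete, here is a minimal Python sketch of exact nearest neighbor searching under a Minkowski metric. It is a brute-force baseline that scans all n points on every query, not the preprocessed data structure the paper describes. The function names `minkowski_distance` and `nearest_neighbor` and the parameter `m` (the order of the metric) are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def minkowski_distance(p: Sequence[float], q: Sequence[float], m: float = 2.0) -> float:
    """Minkowski (L_m) distance between two points of equal dimension.

    m = 1 gives the Manhattan metric, m = 2 the Euclidean metric,
    and m = math.inf the max (Chebyshev) metric.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    if m < 1:
        raise ValueError("m must be at least 1 for a Minkowski metric")
    if math.isinf(m):
        return max(abs(a - b) for a, b in zip(p, q))
    return sum(abs(a - b) ** m for a, b in zip(p, q)) ** (1.0 / m)


def nearest_neighbor(points: Sequence[Point], q: Point, m: float = 2.0) -> Point:
    """Return the point of `points` closest to the query q (linear scan)."""
    if not points:
        raise ValueError("the data set must not be empty")
    return min(points, key=lambda p: minkowski_distance(p, q, m))


if __name__ == "__main__":
    S = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
    print(nearest_neighbor(S, (2.0, 2.0)))          # Euclidean metric
    print(nearest_neighbor(S, (3.0, 3.0), m=1.0))   # Manhattan metric
```

A linear scan costs O(dn) time per query with no preprocessing, which is the baseline that any preprocessed search structure aims to beat.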
Distance browsing in spatial databases
TLDR
The incremental nearest neighbor algorithm significantly outperforms the existing k-nearest neighbor algorithm for distance browsing queries in a spatial database that uses the R-tree as a spatial index. It is also proved informally that, at any step in its execution, the incremental nearest neighbor algorithm is optimal with respect to the spatial data structure that is employed.
Approximate Nearest Neighbor under edit distance via product metrics
  • P. Indyk
  • Mathematics, Computer Science
  • SODA '04
  • 2004
TLDR
To the authors' knowledge, this is the first data structure for this problem with both query time and storage subexponential in d; the space requirement of this data structure is roughly O(…), i.e., strongly subexponential.
Space-time tradeoffs for approximate nearest neighbor searching
TLDR
A single approach to nearest neighbor searching is presented that both improves upon existing results and spans the spectrum of space-time tradeoffs. New algorithms for constructing approximate Voronoi diagrams (AVDs) and tools for analyzing their total space requirements are also provided.
Aggregate nearest neighbor queries in spatial databases
TLDR
Algorithms are developed for aggregate nearest neighbor queries when Q fits in memory and P is indexed by an R-tree. They capture several versions of the problem, including weighted queries and incremental reporting of results.
Efficient algorithms for substring near neighbor problem
TLDR
The problem of finding the approximate nearest neighbor when the data set points are the substrings of a given text T is considered, and a data structure for this problem is presented.
Space-time tradeoffs for approximate spherical range counting
TLDR
This work presents space-time tradeoffs for approximate spherical range counting queries, broadly based on methods developed for approximate Voronoi diagrams, but it involves a number of significant extensions from the context of nearest neighbor searching to range searching.
Linear-size approximate Voronoi diagrams
TLDR
It is shown that for a real parameter $2 \le \gamma \le 1/\varepsilon$, it is possible to construct an AVD consisting of $O(n/\varepsilon^d)$ cells for $t = 1$, and cells in these AVDs are cubes or differences of two cubes.
Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform
TLDR
A new low-distortion embedding of $\ell_2^d$ into $\ell_p$ ($p = 1, 2$) is introduced, called the Fast Johnson-Lindenstrauss Transform (FJLT), based upon the preconditioning of a sparse projection matrix with a randomized Fourier transform.
Entropy based nearest neighbor search in high dimensions
TLDR
The problem of finding the approximate nearest neighbor of a query point in high-dimensional space is studied, focusing on Euclidean space, and it is shown that the c-approximate nearest neighbor can be computed in time … and near-linear space, where $\rho \approx 2.06/c$ as c becomes large.
Approximate Nearest Neighbor Search Amid Higher-Dimensional Flats
TLDR
This work considers the approximate nearest neighbor (ANN) problem where the input set consists of n k-flats in the Euclidean $\mathbb{R}^d$, for any fixed parameters $0 \le k < d$, and presents an algorithm that achieves this task with $n^{k+1}(\log(n)/\varepsilon)^{O(1)}$ storage and preprocessing, and can answer a query in $O(\mathrm{polylog}(n))$ time.

References

An algorithm for approximate closest-point queries
TLDR
An algorithm is given for approximately solving the post office problem: given n points (sites) in d dimensions, build a data structure so that, given a query point, a closest site to the query point can be found quickly.
A general approach to d-dimensional geometric queries
TLDR
It is shown that any bounded region in $E^d$ can be divided into $2^d$ subregions of equal volume in such a way that no hyperplane in $E^d$ can intersect all $2^d$ of the subregions.
Linear time algorithms for visibility and shortest path problems inside simple polygons
We present linear time algorithms for solving the following problems involving a simple planar polygon P: (i) Computing the collection of all shortest paths inside P…
Optimal algorithms for approximate clustering
TLDR
This work gives a polynomial time approximation scheme that estimates the optimal number of clusters under the second measure of cluster size within factors arbitrarily close to 1 for a fixed cluster size.
Approximate nearest neighbor queries in fixed dimensions
TLDR
A practical variant of this algorithm is implemented, and it is shown empirically that for many point distributions this variant of the algorithm finds the nearest neighbor in moderately large dimension significantly faster than existing practical approaches.
A Randomized Algorithm for Closest-Point Queries
  • K. Clarkson
  • Mathematics, Computer Science
  • SIAM J. Comput.
  • 1988
TLDR
This result approaches the $\Omega(n^{\lceil d/2 \rceil})$ worst-case time required for any algorithm that constructs the Voronoi...
Approximate nearest neighbor queries revisited
TLDR
New methods to answer approximate nearest neighbor queries on a set of n points in d-dimensional Euclidean space are proposed and applications to various proximity problems are discussed.
Accounting for boundary effects in nearest neighbor searching
TLDR
An accurate analysis of the number of cells visited in nearest-neighbor searching by the bucketing and k-d tree algorithms is provided, and empirical evidence is presented showing that the analysis applies even in low dimensions.
Two algorithms for nearest-neighbor search in high dimensions
TLDR
A new approach to the nearest-neighbor problem is developed, based on a method for combining randomly chosen one-dimensional projections of the underlying point set, which results in an algorithm for finding ε-approximate nearest neighbors with a query time of O((d log d)(d + log n)).
A cost model for nearest neighbor search in high-dimensional data space
TLDR
A new cost model for nearest neighbor search in high-dimensional data space is developed. It takes boundary effects into account, so it also works in high dimensions, and it is applicable to different data distributions and index structures.