An Algorithm for Finding Best Matches in Logarithmic Expected Time

@article{Friedman1977AnAF,
  title={An Algorithm for Finding Best Matches in Logarithmic Expected Time},
  author={Jerome H. Friedman and Jon Louis Bentley and Raphael A. Finkel},
  journal={ACM Trans. Math. Softw.},
  year={1977},
  volume={3},
  pages={209--226}
}
An algorithm and data structure are presented for searching a file containing N records, each described by k real-valued keys, for the m closest matches or nearest neighbors to a given query record. The computation required to organize the file is proportional to kN log N. The expected number of records examined in each search is independent of the file size. The expected computation to perform each search is proportional to log N. Empirical evidence suggests that except for very small files, this…
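The search described in the abstract can be illustrated with a minimal k-d tree sketch. This is not the paper's exact method: it assumes one record per node and a split key that simply cycles through the k keys, and the names `Node`, `build`, `sq_dist`, and `nearest` are hypothetical. It re-sorts at every level for brevity, so construction here costs more than the kN log N bound quoted above.

```python
import heapq
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


class Node:
    """One record plus the key (axis) used to split its subtree."""

    def __init__(self, point: Point, axis: int,
                 left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None) -> None:
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced tree by splitting at the median of one key per level."""
    if not points:
        return None
    axis = depth % len(points[0])  # assumption: cycle through the k keys
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(root: Optional[Node], query: Point, m: int = 1) -> List[Tuple[float, ...]]:
    """Return the m records closest to query, nearest first."""
    heap: List[Tuple[float, Tuple[float, ...]]] = []  # max-heap via negated distance

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d2 = sq_dist(node.point, query)
        item = (-d2, tuple(node.point))
        if len(heap) < m:
            heapq.heappush(heap, item)
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, item)  # evict the current worst match
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Search the far side only if the splitting plane is closer than
        # the current m-th best match (or fewer than m matches are held).
        if len(heap) < m or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [p for _, p in sorted(heap, reverse=True)]
```

Calling `nearest(build(records), query, m=3)` on a list of equal-length tuples returns the three closest records. The pruning test on the splitting plane is what keeps the number of examined records small on typical data.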

Complexity Analysis for Partitioning Nearest Neighbor Searching Algorithms
TLDR
The asymptotic expected number of operations to find the nearest neighbor is presented as a function of the average number of patterns per bucket n and is shown to contain a global minimum.
New techniques for best-match retrieval
TLDR
A scheme to answer best-match queries from a file containing a collection of objects to allow the optimum use of any given set of precomputed intrafile distances is described.
Nearest neighbor queries
TLDR
This paper presents an efficient branch-and-bound R-tree traversal algorithm to find the nearest neighbor object to a point, and then generalizes it to finding the k nearest neighbors.
A tree algorithm for nearest neighbor searching in document retrieval systems
TLDR
A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described, and this algorithm is compared with two other searching algorithms; sequential search and clustered search.
A survey of algorithms and data structures for range searching
TLDR
A set of “logical structures” is described, their implementation in primary and secondary memories is then discussed, and a set of algorithms for efficiently answering range queries is surveyed.
Proximity Matching Using Fixed-Queries Trees
TLDR
This work presents a new data structure, called the fixed-queries tree, for the problem of finding all elements of a fixed set that are close to a query element under some distance function.
The Efficiency of Using k-d Trees for Finding Nearest Neighbors in Discrete Space

References

Some approaches to best-match file searching
TLDR
Three file structures are presented together with their corresponding search algorithms, which are intended to reduce the number of comparisons required to achieve the desired result.
A Branch and Bound Algorithm for Computing k-Nearest Neighbors
TLDR
The method of branch and bound is implemented in the present algorithm to facilitate rapid calculation of the k-nearest neighbors by eliminating the necessity of calculating many distances.
An Algorithm for Finding Nearest Neighbors
An algorithm that finds the k nearest neighbors of a point, from a sample of size N in a d-dimensional space, with an expected number of distance calculations is described, its properties examined,
Multidimensional binary search trees used for associative searching
TLDR
The multidimensional binary search tree (or k-d tree) as a data structure for storage of information to be retrieved by associative searches is developed, and it is shown to be quite efficient in its storage requirements.
Geometric complexity
M. Shamos. Mathematics, Computer Science. STOC, 1975.
TLDR
An effort is made to recast classical theorems into a useful computational form and analogies are developed between constructibility questions in Euclidean geometry and computability questions in modern computational complexity.
Optimization of k nearest neighbor density estimates
TLDR
Nonparametric density estimation using the k-nearest-neighbor approach is discussed, and a functional form for the optimum k in terms of the sample size, the dimensionality of the observation space, and the underlying probability distribution is obtained.
Constructing Optimal Binary Decision Trees is NP-Complete
The Art of Computer Programming
On the Optimality of Elia's Algorithm for Performing Best-Match Searches
Numerical Computing and Mathematical Analysis