Approximate near neighbors for general symmetric norms

@inproceedings{Andoni2017ApproximateNN,
  title={Approximate near neighbors for general symmetric norms},
  author={Alexandr Andoni and Huy L. Nguyen and Aleksandar Nikolov and Ilya P. Razenshteyn and Erik Waingarten},
  booktitle={Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing},
  year={2017}
}
We show that every symmetric normed space admits an efficient nearest neighbor search data structure with doubly-logarithmic approximation. Specifically, for every n, every d = n^{o(1)}, and every d-dimensional symmetric norm ||·||, there exists a data structure for (log log n)-approximate nearest neighbor search over ||·|| for n-point datasets achieving n^{o(1)} query time and n^{1+o(1)} space. The main technical ingredient of the algorithm is a low-distortion embedding of a symmetric norm into a low…

Data-dependent hashing via nonlinear spectral gaps

We establish a generic reduction from _nonlinear spectral gaps_ of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near

Fast Distance Oracles for Any Symmetric Norm

The main contribution is a fast (1 + ε) distance oracle for any symmetric norm ‖·‖, which includes lp norms and Orlicz norms as special cases, as well as other norms used in practice, e.g. top-k norms, max-mixtures and sum-mixtures of lp norms, small-support norms, and the box-norm.
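The top-k norm mentioned above has a simple concrete form: the sum of the k largest absolute coordinates, interpolating between the l_inf norm (k = 1) and the l1 norm (k = d). A minimal illustrative sketch (the function name `top_k_norm` is ours, not from the paper):

```python
def top_k_norm(x, k):
    """Top-k norm: sum of the k largest absolute coordinates of x.

    This is a symmetric norm (invariant under sign flips and
    coordinate permutations).  k = 1 gives the max (l_inf) norm;
    k = len(x) gives the l_1 norm.
    """
    return sum(sorted((abs(v) for v in x), reverse=True)[:k])

x = [3.0, -1.0, 4.0, -1.0, 5.0]
print(top_k_norm(x, 2))       # 9.0  (5 + 4)
print(top_k_norm(x, 1))       # 5.0  == l_inf norm
print(top_k_norm(x, len(x)))  # 14.0 == l_1 norm
```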

Near neighbor preserving dimension reduction for doubling subsets of $\ell_1$

The approach is to represent the pointset with a carefully chosen covering set and then randomly project the latter; two types of covering sets are studied, $c$-approximate $r$-nets and randomly shifted grids, along with the tradeoff between them in terms of preprocessing time and target dimension.
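For reference, an $r$-net can be built by a simple greedy pass: keep a point only if it is more than $r$ from every point already kept. A minimal sketch under $\ell_1$ (an exact net, not the paper's $c$-approximate construction; the names `l1` and `greedy_r_net` are ours):

```python
def l1(p, q):
    """l_1 distance between two equal-length point tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))

def greedy_r_net(points, r):
    """Greedy r-net: net points are pairwise more than r apart
    (packing), and every input point lies within r of some net
    point (covering)."""
    net = []
    for p in points:
        if all(l1(p, q) > r for q in net):
            net.append(p)
    return net

points = [(0, 0), (1, 0), (2, 0), (5, 0)]
net = greedy_r_net(points, 1.5)
print(net)  # [(0, 0), (2, 0), (5, 0)]
```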

High-dimensional similarity search and sketching: algorithms and hardness

An algorithm for the ANN problem over the l1 and l2 distances that improves upon the Locality-Sensitive Hashing framework and establishes the equivalence between the existence of short and accurate sketches and good embeddings into lp spaces for 0 < p ≤ 2.

Hölder Homeomorphisms and Approximate Nearest Neighbors

Two new data structures for ANN over a general d-dimensional normed space are given, which for the first time achieve approximation d^{o(1)}, thus improving upon the previous general bound O(√d) that is directly implied by John's theorem.

Approximate Similarity Search Under Edit Distance Using Locality-Sensitive Hashing

This work achieves the first bounds for any approximation factor c, via a simple and easy-to-implement hash function, and shows how to apply these ideas to the closely-related Approximate Nearest Neighbor problem for edit distance, obtaining similar time bounds.

Efficient Symmetric Norm Regression via Linear Sketching

This work provides efficient algorithms for overconstrained linear regression problems of size n×d when the loss function is a symmetric norm (a norm invariant under sign-flips and coordinate permutations), and produces the first input-sparsity-time algorithm with provable guarantees for the general class of symmetric norm regression problems.
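The defining invariances of a symmetric norm can be checked numerically at a given point by brute force over all sign flips and coordinate permutations. A small sketch (the helper `is_symmetric_at` is ours, and it tests only a necessary condition at one point, not the full norm axioms):

```python
import itertools

def is_symmetric_at(norm, x, tol=1e-9):
    """Check that norm(x) is unchanged by every sign flip and
    coordinate permutation of x (a necessary condition for a
    symmetric norm, tested at the single point x)."""
    base = norm(x)
    n = len(x)
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        for perm in itertools.permutations(range(n)):
            y = [signs[i] * x[perm[i]] for i in range(n)]
            if abs(norm(y) - base) > tol:
                return False
    return True

l2 = lambda v: sum(t * t for t in v) ** 0.5
weighted = lambda v: abs(v[0]) + 2 * abs(v[1])  # not permutation-invariant

print(is_symmetric_at(l2, [1.0, 2.0, 3.0]))   # True
print(is_symmetric_at(weighted, [1.0, 2.0]))  # False
```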

Hardness of Approximate Nearest Neighbor Search under L-infinity

It is shown that the approximation factor of $3$ is a barrier for any naive gadget reduction from the Orthogonal Vectors problem, and that near-linear query time is required for any approximation factor less than $3$ under $\ell_\infty$.

Revisiting kd-tree for Nearest Neighbor Search

Empirically validates the search accuracy and query-time guarantees of the proposed schemes, demonstrating significantly improved scaling for the same level of accuracy.

References


Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions

  • Alexandr Andoni, P. Indyk
  • Computer Science
    2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
  • 2006
We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2+o(1)}) and space O(dn + n^{1+1/c^2+o(1)}). This almost matches…

Approximate nearest neighbors: towards removing the curse of dimensionality

Two algorithms for the approximate nearest neighbor problem in high-dimensional spaces are presented, which require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d.

Optimal Data-Dependent Hashing for Approximate Near Neighbors

The new bound is not only optimal, but in fact improves over the best LSH data structures (Indyk, Motwani 1998) (Andoni, Indyk 2006) for all approximation factors c>1.

Finding nearest neighbors in growth-restricted metrics

This paper develops an efficient dynamic data structure for nearest neighbor queries in growth-constrained metrics that satisfy the property that for any point q and number r the ratio between numbers of points in balls of radius 2r and r is bounded by a constant.
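The growth condition in the snippet above is directly checkable on a finite point set: count the points in the balls of radius 2r and r around a query and take the ratio. A minimal sketch under the Euclidean metric (helper names are ours):

```python
import math

def ball_size(points, q, r):
    """Number of points within distance r of q (Euclidean metric)."""
    return sum(1 for p in points if math.dist(p, q) <= r)

def expansion_ratio(points, q, r):
    """|B(q, 2r)| / |B(q, r)|.  In a growth-restricted metric this
    ratio is bounded by a constant for every q and r."""
    return ball_size(points, q, 2 * r) / max(ball_size(points, q, r), 1)

# A 2-D integer grid has expansion ratio close to 2^2 = 4.
grid = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]
print(expansion_ratio(grid, (0, 0), 3))  # ~3.9
```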

Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality

Two algorithms for the approximate nearest neighbor problem in high dimensional spaces for data sets of size n living in ℝ^d are presented, achieving query times that are sub-linear in n and polynomial in d.

Approximate nearest neighbor algorithms for Frechet distance via product metrics

Several data structures using space (quasi-)polynomial in n and d, and query time sublinear in n, have been discovered for approximate NNS under the l1 and l2 [14, 12, 11] and l∞ [10] norms.

On Approximate Nearest Neighbors under linfinity Norm

  • P. Indyk
  • Computer Science, Mathematics
    J. Comput. Syst. Sci.
  • 2001
The nearest neighbor search (NNS) problem is the following: given a set of n points P = {p1, …, pn} in some metric space X, preprocess P so as to efficiently answer queries which require finding a…

Streaming symmetric norms via measure concentration

The general results are applied to easily derive bounds for several norms that were not studied before in the streaming model, including the top-k norm and the k-support norm, which was recently employed for machine learning tasks.

Beyond Locality-Sensitive Hashing

By a standard reduction, a new data structure is presented for the Hamming space and l1 norm with ρ ≤ 7/(8c) + O(1/c^{3/2}) + o_c(1), which is the first improvement over the result of Indyk and Motwani (STOC 1998).
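The improvement is easiest to see numerically: the classical Indyk–Motwani LSH exponent for Hamming distance is ρ = 1/c, while the bound above has leading term 7/(8c). A quick illustrative comparison:

```python
# Leading-term comparison: classical LSH exponent 1/c versus the
# 7/(8c) leading term of the bound above, for a few factors c > 1.
for c in (2.0, 4.0, 10.0):
    print(f"c={c}: 1/c={1 / c:.4f}, 7/(8c)={7 / (8 * c):.4f}")
```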

Nearest Neighbor Queries in Metric Spaces

The preprocessing algorithm for M(S,Q) can be used to solve the all-nearest-neighbor problem for S in O(n (log n)^2 (log Υ(S))^2) expected time, and the resource bounds increase linearly in K.