Corpus ID: 211068748

Locality-sensitive hashing in function spaces

Will Shand and Stephen Becker
We discuss the problem of performing similarity search over function spaces. To perform search over such spaces in a reasonable amount of time, we use {\it locality-sensitive hashing} (LSH). We present two methods that allow LSH functions on $\mathbb{R}^N$ to be extended to $L^p$ spaces: one using function approximation in an orthonormal basis, and another using (quasi-)Monte Carlo-style techniques. We use the presented hashing schemes to construct an LSH family for Wasserstein distance over… 
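
The Monte Carlo-style extension mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MonteCarloL2Hash`, the domain $[0, 1]$, the choice $p = 2$, and all parameter values are assumptions made for illustration. A function is sampled at $N$ random points and scaled by $(1/N)^{1/2}$, so that Euclidean distance between sample vectors approximates $L^2$ distance between functions. The resulting vector is then hashed with a Gaussian random-projection hash, which is the $p = 2$ case of the p-stable scheme listed among the references.

```python
import numpy as np


class MonteCarloL2Hash:
    """Hypothetical helper: hash functions on [0, 1] under an approximate L^2 distance."""

    def __init__(self, n_samples=256, n_hashes=8, r=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Sample points shared by every function that gets hashed (assumed domain: [0, 1]).
        self.points = rng.uniform(0.0, 1.0, size=n_samples)
        # Scaling (volume / N)^(1/p) with volume = 1 and p = 2.
        self.scale = (1.0 / n_samples) ** 0.5
        # Gaussian (2-stable) projection hash: h(v) = floor((a . v + b) / r).
        self.a = rng.standard_normal((n_hashes, n_samples))
        self.b = rng.uniform(0.0, r, size=n_hashes)
        self.r = r

    def embed(self, f):
        """Map a callable f to a vector in R^N whose l^2 norm approximates ||f||_{L^2}."""
        values = np.asarray(f(self.points), dtype=float)
        return self.scale * np.broadcast_to(values, self.points.shape)

    def __call__(self, f):
        projections = self.a @ self.embed(f) + self.b
        return tuple(int(k) for k in np.floor(projections / self.r))


hasher = MonteCarloL2Hash()
f = lambda x: np.sin(2 * np.pi * x)
g = lambda x: np.cos(2 * np.pi * x)

# The exact L^2([0, 1]) distance between f and g is 1; this is a sampled estimate.
approx_dist = np.linalg.norm(hasher.embed(f) - hasher.embed(g))
print(approx_dist, hasher(f), hasher(g))
```

Because every function is evaluated at the same sample points, the hash is deterministic once the seed is fixed. Replacing the uniform samples with a low-discrepancy sequence would give a quasi-Monte Carlo variant of the same idea.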

Locality-sensitive hashing scheme based on p-stable distributions
A novel locality-sensitive hashing scheme for the approximate nearest neighbor problem under the $l_p$ norm, based on p-stable distributions, that improves the running time of the earlier algorithm and yields the first known provably efficient approximate NN algorithm for the case $p<1$.
Locality-Sensitive Hashing for Finding Nearest Neighbors in Probability Distributions
This paper presents a novel LSH scheme adapted to angular distance for ANN search in high-dimensional probability distributions, and proposes a Sequential Interleaving algorithm based on the “Unbalance Effect” of Euclidean and angular metrics for probability distributions.
Improved Asymmetric Locality Sensitive Hashing (ALSH) for Maximum Inner Product Search (MIPS)
Theoretical analysis and experimental evaluations show that the new scheme is significantly better than the original scheme for MIPS and can be efficiently solved using signed random projections.
Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search
This paper proposes a new indexing scheme called multi-probe LSH, built on the well-known LSH technique, but it intelligently probes multiple buckets that are likely to contain query results in a hash table to achieve the same search quality.
Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
We present the first provably sublinear time algorithm for approximate \emph{Maximum Inner Product Search} (MIPS). Our proposal is also the first hashing algorithm for searching with (un-normalized)
Similarity estimation techniques from rounding algorithms
It is shown that rounding algorithms for LPs and SDPs used in the context of approximation algorithms can be viewed as locality sensitive hashing schemes for several interesting collections of objects.
Fast feature selection and training for AdaBoost-based concept detection with large scale datasets
Experimental results show that the method significantly reduces the training time of the best-learner search procedure, and that its performance is comparable with state-of-the-art methods.
Monte Carlo and Quasi-Monte Carlo Sampling
The Monte Carlo method.- Sampling from known distributions.- Pseudorandom number generators.- Variance reduction techniques.- Quasi-Monte Carlo constructions.- Using quasi-Monte Carlo constructions.-
Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling
Preface.- Primal and Dual Problems.- One-Dimensional Issues.- L^1 and L^infinity Theory.- Minimal Flows.- Wasserstein Spaces.- Numerical Methods.- Functionals over Probabilities.- Gradient Flows.-