Metric Embedding into the Hamming Space with the n-Simplex Projection

Lucia Vadicamo, Vladimir Mic, F. Falchi, Pavel Zezula
Transformations of data objects into the Hamming space are often exploited to speed up similarity search in metric spaces. Techniques applicable in generic metric spaces require expensive learning, e.g., selection of pivoting objects. However, when searching in common Euclidean space, the best performance is usually achieved by transformations specifically designed for this space. We propose a novel transformation technique that provides a good trade-off between the applicability and the… 
1 Citation

On the Similarity Search With Hamming Space Sketches

Various challenges of the similarity search with sketches in the Hamming space are addressed, including the definition of sketching transformation and efficient search algorithms that exploit sketches to speed up searching.
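As a rough illustration of how sketches can speed up searching, the following minimal Python sketch stores each sketch as an integer bit mask and keeps only the objects whose sketch lies within a Hamming-distance threshold of the query sketch. The names `hamming`, `filter_by_sketch`, and `max_hamming`, and the sample bit patterns, are assumptions made for this example and are not taken from the paper.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two sketches stored as ints."""
    return bin(a ^ b).count("1")


def filter_by_sketch(query_sketch, sketches, max_hamming):
    """Return ids of objects whose sketch differs from the query sketch
    in at most max_hamming bits (hypothetical helper)."""
    return [obj_id for obj_id, sketch in sketches.items()
            if hamming(query_sketch, sketch) <= max_hamming]


# Illustrative 8-bit sketches; real sketches are typically much longer.
sketches = {"a": 0b10110010, "b": 0b10110011, "c": 0b01001101}
candidates = filter_by_sketch(0b10110000, sketches, max_hamming=2)
```

Only the surviving candidates would then need to be compared with the query using the original, more expensive distance function.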



Hilbert Exclusion

Many common metric spaces, notably those using Euclidean and Jensen-Shannon distances, are shown to have a property stronger than the triangle inequality, sometimes called the four-point property. One consequence in particular, named the Hilbert Exclusion property, allows any indexing mechanism that uses hyperplane partitioning to perform better.
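To make the benefit for hyperplane partitioning concrete, the sketch below compares two lower bounds on the distance from a query q to any object on the far side of the hyperplane between reference points p1 and p2 (q being closer to p2). The first bound follows from the triangle inequality alone. The second is the Euclidean point-to-hyperplane distance, which is assumed here to be the bound that the four-point property justifies. Function names and the sample distances are hypothetical.

```python
def metric_bound(d_q_p1: float, d_q_p2: float) -> float:
    """Lower bound from the triangle inequality alone."""
    return (d_q_p1 - d_q_p2) / 2.0


def hilbert_bound(d_q_p1: float, d_q_p2: float, d_p1_p2: float) -> float:
    """Lower bound assuming the four-point property: the distance from q
    to the hyperplane separating p1 and p2 in a Euclidean embedding."""
    return (d_q_p1 ** 2 - d_q_p2 ** 2) / (2.0 * d_p1_p2)


def can_exclude(bound: float, radius: float) -> bool:
    """The far partition can be skipped when the bound exceeds the query radius."""
    return bound > radius


# Illustrative distances: d(q, p1) = 5, d(q, p2) = 3, d(p1, p2) = 4.
radius = 1.5
weak = metric_bound(5.0, 3.0)
strong = hilbert_bound(5.0, 3.0, 4.0)
```

With these numbers the triangle-inequality bound is too small to exclude the partition at radius 1.5, while the tighter bound allows it.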

Supermetric Search

High-Dimensional Simplexes for Supermetric Search

The n-point property is a generalisation of the triangle inequality where, for any \((n+1)\) objects in the space, there exists an n-dimensional simplex whose edge lengths correspond to the distances among the objects.
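One way to see what the n-point property asserts is to try to construct such a simplex from a table of pairwise distances. The following is a minimal sketch, not the construction used in the paper: it places the first object at the origin and solves for each further vertex one coordinate at a time, failing when no Euclidean simplex matches the distances. The function name `build_simplex` and the tolerance `eps` are assumptions for this example, and the distances are assumed to describe a non-degenerate simplex.

```python
import math


def build_simplex(dist, eps=1e-9):
    """Return n+1 vertices in n dimensions whose pairwise Euclidean
    distances match the (n+1) x (n+1) matrix dist, or raise ValueError."""
    n_points = len(dist)
    dim = n_points - 1
    vertices = [[0.0] * dim for _ in range(n_points)]
    for k in range(1, n_points):
        d0k_sq = dist[0][k] ** 2
        for i in range(1, k):
            # Dot product v_k . v_i implied by the three mutual distances.
            dot = (d0k_sq + dist[0][i] ** 2 - dist[i][k] ** 2) / 2.0
            partial = sum(vertices[k][j] * vertices[i][j] for j in range(i - 1))
            pivot = vertices[i][i - 1]
            if abs(pivot) < eps:
                raise ValueError("degenerate simplex")
            vertices[k][i - 1] = (dot - partial) / pivot
        # The remaining squared length becomes the last non-zero coordinate.
        height_sq = d0k_sq - sum(x * x for x in vertices[k][:k - 1])
        if height_sq < -eps:
            raise ValueError("no Euclidean simplex has these edge lengths")
        vertices[k][k - 1] = math.sqrt(max(height_sq, 0.0))
    return vertices


# A 3-4-5 right triangle is a valid 2-dimensional simplex.
triangle = build_simplex([[0, 3, 4], [3, 0, 5], [4, 5, 0]])
```

For three objects the check reduces to the ordinary triangle inequality; for more objects it is a strictly stronger condition.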

Selecting Sketches for Similarity Search

This work proposes a way to efficiently estimate the quality of sketches using just a small sample of the data, based on a probabilistic analysis of sketches that describes how well separated objects are after projection to the Hamming space.

Effective Proximity Retrieval by Ordering Permutations

A new probabilistic proximity search algorithm for range and k-nearest neighbor (k-NN) searching in both coordinate and metric spaces is introduced to predict closeness between elements according to how they order their distances toward a distinguished set of anchor objects.
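The idea of predicting closeness from how objects order their distances to anchors can be sketched as follows. Each object is represented by the permutation of anchor indices sorted from nearest to farthest, and two permutations are compared with the Spearman footrule, which is one common choice and is assumed here rather than taken from the summary above. The anchors, points, and function names are hypothetical, and Euclidean distance stands in for a generic metric.

```python
import math


def permutation(obj, anchors):
    """Anchor indices ordered from nearest to farthest from obj."""
    return sorted(range(len(anchors)), key=lambda i: math.dist(obj, anchors[i]))


def spearman_footrule(perm_a, perm_b):
    """Sum of the displacements of each anchor between two permutations."""
    pos_b = {anchor: rank for rank, anchor in enumerate(perm_b)}
    return sum(abs(rank - pos_b[anchor]) for rank, anchor in enumerate(perm_a))


anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
query = (1.0, 2.0)
near = (1.5, 2.5)   # close to the query
far = (9.0, 1.0)    # far from the query

score_near = spearman_footrule(permutation(query, anchors), permutation(near, anchors))
score_far = spearman_footrule(permutation(query, anchors), permutation(far, anchors))
```

A smaller score suggests that an object is more likely to be close to the query, so candidates can be examined in order of increasing score.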

Asymmetric Distances for Binary Embeddings

This work proposes two general asymmetric distances that are applicable to a wide variety of embedding techniques including locality sensitive hashing (LSH), locality sensitive binary codes (LSBC), spectral hashing (SH), PCA embedding (PCAE), PCAE with random rotations (PCAE-RR), and PCAE with iterative quantization (PCAE-ITQ).

Binary Sketches for Secondary Filtering

This article proposes an approach to enhancing the existing search techniques to significantly reduce the number of accessed data objects while preserving the quality of the search results, and provides a probabilistic model to tune the parameters of the sketch-based filtering separately for each query object.

Polysemous Codes

Polysemous codes are introduced, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance. Their design is inspired by algorithms introduced in the 1990s to construct channel-optimized vector quantizers.

Aggregating local descriptors into a compact image representation

This work proposes a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation, and shows how to jointly optimize the dimension reduction and the indexing algorithm.

Binary Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey

This paper illustrates the development of binary hashing research by proposing a clear overall classification of the methods, and conducts extensive experiments to compare their performance on five well-known public data sets.