Metric Embedding into the Hamming Space with the n-Simplex Projection

Authors: Lucia Vadicamo, Vladimir Mic, F. Falchi, Pavel Zezula
Transformations of data objects into the Hamming space are often exploited to speed up similarity search in metric spaces. Techniques applicable to generic metric spaces require expensive learning, e.g., the selection of pivot objects. However, when searching in the common Euclidean space, the best performance is usually achieved by transformations designed specifically for this space. We propose a novel transformation technique that provides a good trade-off between the applicability and the…
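As a rough illustration of the general idea in the abstract above (mapping objects to bit strings and comparing them with the Hamming distance), the following minimal sketch uses generalized hyperplane partitioning over pivot pairs. This is an assumed, generic sketching scheme chosen for brevity, not the n-Simplex-based transformation the paper proposes; the names `sketch`, `hamming`, and `pivot_pairs`, and the toy data, are hypothetical.

```python
import math

def sketch(obj, pivot_pairs, dist=math.dist):
    """Build a bit string (stored as an int) for obj.

    Bit i is 1 when obj is at least as close to the first pivot of
    pair i as to the second one (generalized hyperplane partitioning).
    Assumption: this is a generic scheme, not the paper's technique.
    """
    bits = 0
    for i, (p, q) in enumerate(pivot_pairs):
        if dist(obj, p) <= dist(obj, q):
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Hamming distance between two sketches stored as ints."""
    return bin(a ^ b).count("1")

# Hypothetical toy data in the 2D Euclidean space.
pivot_pairs = [
    ((0.0, 0.0), (1.0, 0.0)),
    ((0.0, 0.0), (0.0, 1.0)),
    ((0.0, 1.0), (1.0, 0.0)),
]
data = [(0.1, 0.2), (0.9, 0.2), (0.2, 0.8), (0.9, 0.8)]
query = (0.15, 0.2)

# Rank the data by Hamming distance between sketches. In a
# filter-and-refine setting, only the best-ranked candidates would
# then be re-checked with the original (more expensive) distance.
q_sketch = sketch(query, pivot_pairs)
candidates = sorted(data, key=lambda o: hamming(sketch(o, pivot_pairs), q_sketch))
```

The speed-up comes from replacing most evaluations of the original distance with cheap XOR and bit-count operations; how well the ranking approximates the true one depends on the choice of pivots.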
1 Citation

On the Similarity Search With Hamming Space Sketches
Various challenges of similarity search with sketches in the Hamming space are addressed, including the definition of the sketching transformation and efficient search algorithms that exploit sketches to speed up searching.


Hilbert Exclusion: Improved Metric Search through Finite Isometric Embeddings
It is shown that many common metric spaces, notably including those using Euclidean and Jensen-Shannon distances, also have a stronger property, sometimes called the four-point property, and one in particular, which is named the Hilbert Exclusion property, allows any indexing mechanism which uses hyperplane partitioning to perform better.
Supermetric Search
A full investigation into the use of the supermetric property within a variety of different hyperplane partition indexing structures is presented, and some more of its flexibility is shown by examining a search structure whose partition and exclusion conditions are tailored, at each node, to suit the individual reference points and data set present there.
High-Dimensional Simplexes for Supermetric Search
The n-point property is a generalisation of the triangle inequality where, for any \((n+1)\) objects in the space, there exists an n-dimensional simplex whose edge lengths correspond to the distances among the objects.
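To make the n-point property concrete, the sketch below tries to build such a simplex from a matrix of pairwise distances. It uses a standard Cholesky-style construction (first object at the origin, each further object adding one new coordinate), which is an assumption made here for illustration and not necessarily the construction used in the cited paper. The function name `simplex_vertices` is hypothetical, and the code only handles non-degenerate simplexes.

```python
import math

def simplex_vertices(D, tol=1e-9):
    """Return coordinates of a simplex whose edge lengths match D.

    D is a symmetric matrix of pairwise distances among n + 1 objects.
    Vertex 0 sits at the origin and vertex k uses only its first k
    coordinates. Raises ValueError when no non-degenerate simplex fits,
    e.g. when the distances violate the n-point property.
    """
    m = len(D)
    verts = [[0.0] * (m - 1) for _ in range(m)]
    for k in range(1, m):
        for j in range(1, k):
            # Inner product <v_k, v_j> implied by the distances (law of cosines).
            gram = (D[0][k] ** 2 + D[0][j] ** 2 - D[j][k] ** 2) / 2.0
            dot = sum(verts[k][i] * verts[j][i] for i in range(j - 1))
            verts[k][j - 1] = (gram - dot) / verts[j][j - 1]
        # Squared length left over for the one new coordinate of vertex k.
        rest = D[0][k] ** 2 - sum(c * c for c in verts[k][:k - 1])
        if rest <= tol:
            raise ValueError("no non-degenerate simplex matches these distances")
        verts[k][k - 1] = math.sqrt(rest)
    return verts

# Three objects with distances 3, 4 and 5 form a right triangle.
triangle = [
    [0.0, 3.0, 5.0],
    [3.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
v = simplex_vertices(triangle)  # [[0.0, 0.0], [3.0, 0.0], [3.0, 4.0]]
```

For three objects this reduces to the triangle inequality: distances such as 1, 1 and 3 leave a negative value under the square root, so the function raises `ValueError`.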
Selecting Sketches for Similarity Search
This work proposes a way to efficiently estimate the quality of sketches using just a small sample set of data, based on a probabilistic analysis of sketches which describes how well separated objects are after projection to the Hamming space.
Effective Proximity Retrieval by Ordering Permutations
A new probabilistic proximity search algorithm for range and K-nearest neighbor (K-NN) searching in both coordinate and metric spaces is introduced to predict closeness between elements according to how they order their distances toward a distinguished set of anchor objects.
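The idea of predicting closeness from how objects order a set of anchors can be sketched as follows. Each object is represented by the permutation of anchor indices sorted by distance, and two permutations are compared with the Spearman footrule. The choice of the footrule, the function names, and the toy anchors are assumptions made for illustration; the cited work may use a different permutation measure.

```python
import math

def pivot_permutation(obj, pivots, dist=math.dist):
    """Indices of the anchors (pivots), ordered from closest to farthest."""
    return sorted(range(len(pivots)), key=lambda i: dist(obj, pivots[i]))

def spearman_footrule(perm_a, perm_b):
    """Sum over anchors of the absolute difference of their ranks."""
    rank_b = {pivot: rank for rank, pivot in enumerate(perm_b)}
    return sum(abs(rank - rank_b[pivot]) for rank, pivot in enumerate(perm_a))

# Hypothetical anchors and objects in the 2D Euclidean space.
pivots = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
perm_x = pivot_permutation((1.0, 0.5), pivots)   # [0, 2, 1]
perm_y = pivot_permutation((3.5, 0.5), pivots)   # [1, 0, 2]
gap = spearman_footrule(perm_x, perm_y)          # 4
```

Objects whose permutations are similar (small footrule value) are treated as likely to be close in the original space, so they are examined first during the search.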
Asymmetric Distances for Binary Embeddings
This work proposes two general asymmetric distances that are applicable to a wide variety of embedding techniques including locality sensitive hashing (LSH), locality sensitive binary codes (LSBC), spectral hashing (SH), PCA embedding (PCAE), PCAE with random rotations (PCAE-RR), and PCAE with iterative quantization (PCAE-ITQ).
Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval
This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections by proposing a simple and efficient alternating minimization algorithm, dubbed iterative quantization (ITQ), and demonstrating an application of ITQ to learning binary attributes or "classemes" on the ImageNet data set.
Binary Sketches for Secondary Filtering
This article proposes an approach to enhancing the existing search techniques to significantly reduce the number of accessed data objects while preserving the quality of the search results, and provides a probabilistic model to tune the parameters of the sketch-based filtering separately for each query object.
Polysemous Codes
Polysemous codes are introduced, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance, and their design is inspired by algorithms introduced in the 1990s to construct channel-optimized vector quantizers.
Aggregating local descriptors into a compact image representation
This work proposes a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation, and shows how to jointly optimize the dimension reduction and the indexing algorithm.