SPLX-Perm: A Novel Permutation-Based Representation for Approximate Metric Search

  title={SPLX-Perm: A Novel Permutation-Based Representation for Approximate Metric Search},
  author={Lucia Vadicamo and Richard C. H. Connor and F. Falchi and Claudio Gennaro and Fausto Rabitti},
Many approaches for approximate metric search rely on a permutation-based representation of the original data objects. The main advantage of transforming metric objects into permutations is that the latter can be efficiently indexed and searched using data structures such as inverted-files and prefix trees. Typically, the permutation is obtained by ordering the identifiers of a set of pivots according to their distances to the object to be represented. In this paper, we present a novel approach… Expand
Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
This approach is particularly advantageous for extensive databases or expensive metric function, and reuse the distances computed in the permutations in the first stage, and hence the memory footprint of the index is not increased. Expand
Indexing Metric Spaces for Exact Similarity Search
Different strengths and weaknesses of different indexing techniques are revealed in order to offer guidance on selecting an appropriate indexing technique for a given setting, and directing the future research for metric indexes. Expand


Use of permutation prefixes for efficient and scalable approximate similarity search
  • Andrea Esuli
  • Mathematics, Computer Science
  • Inf. Process. Manag.
  • 2012
This work presents the Permutation Prefix Index, an index data structure that supports efficient approximate similarity search and shows how the effectiveness can easily reach optimal levels just by adopting two ''boosting'' strategies: multiple index search and multiple query search, which both have nice parallelization properties. Expand
MI-File: using inverted files for scalable approximate similarity search
A new efficient and accurate technique for generic approximate similarity searching, based on the use of inverted files, that enables us to use inverted files to obtain very efficiently a very small set of good candidates for the query result. Expand
PPP-Codes for Large-Scale Similarity Searching
This work proposes an indexing and search technique that can significantly reduce the candidate set size by combination of several space partitionings, and proposes a mapping of objects from a generic metric space onto main memory codes using several pivot spaces. Expand
Effective Proximity Retrieval by Ordering Permutations
A new probabilistic proximity search algorithm for range and A"-nearest neighbor (A"-NN) searching in both coordinate and metric spaces is introduced to predict closeness between elements according to how they order their distances toward a distinguished set of anchor objects. Expand
Similarity Search - The Metric Space Approach
Similarity Search focuses on the state of the art in developing index structures for searching the metric space, and provides an extensive survey of specific techniques for a large range of applications. Expand
Some Theoretical and Experimental Observations on Permutation Spaces and Similarity Search
Permutation based approaches represent data objects as ordered lists of predefined reference objects as well as searching for data objects whose permutation representation is similar to the query one. Expand
The Inverted Multi-Index
Inverted multi-indices were able to significantly improve the speed of approximate nearest neighbor search on the dataset of 1 billion SIFT vectors compared to the best previously published systems, while achieving better recall and incurring only few percent of memory overhead. Expand
Supermetric Search
A full investigation into the use of the supermetric property within a variety of different hyperplane partition indexing structures is presented, and some more of its flexibility is shown by examining a search structure whose partition and exclusion conditions are tailored, at each node, to suit the individual reference points and data set present there. Expand
Comparing top k lists
Motivated by several applications, we introduce various distance measures between "top k lists." Some of these distance measures are metrics, while others are not. For each of these latter distanceExpand
Hilbert Exclusion: Improved Metric Search through Finite Isometric Embeddings
It is shown that many common metric spaces, notably including those using Euclidean and Jensen-Shannon distances, also have a stronger property, sometimes called the four-point property, and one in particular, which is named the Hilbert Exclusion property, allows any indexing mechanism which uses hyperplane partitioning to perform better. Expand