Polysemous Codes

@article{Douze2016PolysemousC,
  title={Polysemous Codes},
  author={Matthijs Douze and Herv{\'e} J{\'e}gou and Florent Perronnin},
  journal={ArXiv},
  year={2016},
  volume={abs/1609.01882}
}
This paper considers the problem of approximate nearest neighbor search in the compressed domain. We introduce polysemous codes, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance. Their design is inspired by algorithms introduced in the 90's to construct channel-optimized vector quantizers. At search time, this dual interpretation accelerates the search. Most of the indexed vectors are filtered out with… 
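The two-step scheme in the abstract, filtering most candidates with cheap Hamming comparisons and then ranking the survivors with product-quantization distance estimates, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the learned polysemous codebook is replaced by random 8-bit codes, and `pq_dist` is a hypothetical stand-in for the PQ distance estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy database of 8-bit codes. In polysemous codes each code is read two ways:
# as a binary string (Hamming filtering) and as a PQ index (distance refinement).
n, code_bits = 1000, 8
db_codes = rng.integers(0, 2 ** code_bits, size=n, dtype=np.uint8)

# Hypothetical per-code distance table standing in for the PQ estimator:
# pq_dist[c] approximates the distance from the query to centroid c.
pq_dist = rng.random(2 ** code_bits)

query_code = db_codes[0]
hamming_threshold = 2

# Step 1: Hamming filter. A cheap popcount comparison keeps few candidates.
xor = np.bitwise_xor(db_codes, query_code)
hamming = np.unpackbits(xor[:, None], axis=1).sum(axis=1)
candidates = np.flatnonzero(hamming <= hamming_threshold)

# Step 2: refine only the survivors with the PQ distance estimate.
best = candidates[np.argmin(pq_dist[db_codes[candidates]])]
```

The point of the dual interpretation is that step 1 touches every vector but costs only an XOR and a popcount, while the more accurate (and more expensive) PQ estimate in step 2 runs on a small fraction of the database.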
Citations

Derived Codebooks for High-Accuracy Nearest Neighbor Search
TLDR
A novel approach is proposed that allows 16-bit quantizers to offer the same response time as 8-bit quantizers, while still providing a boost in accuracy, in high-dimensional nearest neighbor search.
Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors
TLDR
It is argued that the potential of the simple inverted index was not fully exploited in previous works, and its usage is advocated both for highly entangled deep descriptors and for relatively disentangled SIFT descriptors.
PQTable: Nonexhaustive Fast Search for Product-Quantized Codes Using Hash Tables
TLDR
When the vectors are highly compressed, the proposed PQTable achieves one of the fastest search performances on a single CPU to date with significantly efficient memory usage (0.059 ms per query over 10^9 data points with just 5.5 GB of memory consumption).
Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval
TLDR
Experiments conducted on publicly available large-scale datasets show that this method consistently outperforms all compared state-of-the-art unsupervised hashing methods and that the reconstruction procedure can effectively boost the search accuracy with a minimal constant additional cost.
Bolt: Accelerated Data Mining with Fast Vector Compression
TLDR
A vector quantization algorithm that can compress vectors over 12x faster than existing techniques while also accelerating approximate vector operations such as distance and dot product computations by up to 10x is introduced.
Link and Code: Fast Indexing with Graphs and Compact Regression Codes
TLDR
This paper revisits similarity search approaches based on graph walks by considering the memory constraint required to index billions of images on a single server, and proposes a method based both on graph traversal and compact representations that outperforms the state of the art on operating points considering 64-128 bytes per vector.
Quarter-Point Product Quantization for approximate nearest neighbor search
TLDR
A novel codeword expansion method named Quarter-point Product Quantization (QPQ) is proposed to further minimize quantization distortions, by estimating the distances from the query points to the database points using the quarter points instead of the clustering centers.
Quicker ADC : Unlocking the Hidden Potential of Product Quantization With SIMD
TLDR
Quicker ADC is a generalization of Quick ADC that is not limited to PQ; it leverages specific SIMD instructions, and it introduces irregular product quantizers combining sub-quantizers of different granularity as well as split tables that allow lookup tables larger than registers.
Online multimedia retrieval on CPU-GPU platforms with adaptive work partition
TLDR
This parallel IVFADC implements an out-of-GPU-memory execution scheme so that the GPU can be used for databases whose index does not fit in its memory, which is crucial for searching very large databases.
Stochastic Generative Hashing
TLDR
This paper proposes a novel generative approach to learn hash functions through Minimum Description Length principle such that the learned hash codes maximally compress the dataset and can also be used to regenerate the inputs.

References

Showing 1-10 of 61 references
Approximate Nearest Neighbor Search by Residual Vector Quantization
TLDR
This paper introduces residual vector quantization based approaches that are appropriate for unstructured vectors, and compares them to two state-of-the-art methods, spectral hashing and product quantization, on both structured and unstructured datasets.
Sparse composite quantization
TLDR
Sparse composite quantization, which constructs sparse dictionaries, is developed; the benefit is that distance evaluation between the query and a dictionary element (a sparse vector) is accelerated by efficient sparse vector operations, which greatly reduces the cost of distance table computation.
Product Quantization for Nearest Neighbor Search
This paper introduces a product quantization-based approach for approximate nearest neighbor search. The idea is to decompose the space into a Cartesian product of low-dimensional subspaces and to…
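The decomposition described above can be sketched end to end: split each vector into sub-vectors, learn a tiny codebook per subspace, encode vectors as tuples of centroid indices, and answer queries with per-subspace lookup tables (asymmetric distance computation). This is a toy NumPy sketch with hand-rolled Lloyd iterations, not a tuned implementation; all sizes here (`d`, `m`, `k`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

d, m, k = 8, 2, 4          # dimension, number of subspaces, centroids per subspace
data = rng.standard_normal((500, d))
subs = data.reshape(500, m, d // m)

# Train one small codebook per subspace with a few plain k-means iterations.
codebooks = []
for j in range(m):
    c = subs[rng.choice(500, k, replace=False), j]
    for _ in range(10):
        assign = np.argmin(((subs[:, j, None] - c) ** 2).sum(-1), axis=1)
        c = np.array([subs[assign == i, j].mean(0) if (assign == i).any() else c[i]
                      for i in range(k)])
    codebooks.append(c)

# Encode: each vector becomes m small centroid indices.
codes = np.stack([np.argmin(((subs[:, j, None] - codebooks[j]) ** 2).sum(-1), axis=1)
                  for j in range(m)], axis=1)

# Asymmetric distance computation (ADC): build per-subspace lookup tables
# for the query, then sum m table entries per database vector.
query = data[0]
qsubs = query.reshape(m, d // m)
tables = np.stack([((codebooks[j] - qsubs[j]) ** 2).sum(-1) for j in range(m)])
approx_dists = tables[np.arange(m), codes].sum(axis=1)
```

Distance estimation reduces to m table lookups and additions per vector, which is why PQ scales to very large databases.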
Stacked Quantizers for Compositional Vector Compression
TLDR
This paper observes that PQ and AQ are both compositional quantizers that lie on the extremes of the codebook dependence-independence assumption, and explores an intermediate approach that exploits a hierarchical structure in the codebooks, resulting in a method that achieves quantization error on par with or lower than AQ, while being several orders of magnitude faster.
Locality sensitive hashing: A comparison of hash function types and querying mechanisms
TLDR
This paper compares several families of space hashing functions in a real setup and reveals that an unstructured quantizer significantly improves the accuracy of LSH, as it closely fits the data in the feature space.
Beyond “project and sign” for cosine estimation with binary codes
  • R. Balu, T. Furon, H. Jégou
  • Computer Science
    2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2014
TLDR
A better encoding strategy based on the quantization and reconstruction points of view and a novel asymmetric estimator for the cosine similarity to improve the quality of nearest neighbor search with binary codes are proposed.
Asymmetric Distances for Binary Embeddings
TLDR
This work proposes two general asymmetric distances that are applicable to a wide variety of embedding techniques, including locality sensitive hashing (LSH), locality sensitive binary codes (LSBC), spectral hashing (SH), PCA embedding (PCAE), PCAE with random rotations (PCAE-RR), and PCAE with iterative quantization (PCAE-ITQ).
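The asymmetric idea common to these estimators is easy to demonstrate: keep the query uncompressed and compare it directly to the database codes, instead of binarizing both sides. A minimal sketch, using a plain sign binarization as the embedding (an assumption for illustration, not the paper's specific embeddings):

```python
import numpy as np

rng = np.random.default_rng(2)

x_db = rng.standard_normal((200, 16))
q = rng.standard_normal(16)

# Sign binarization: database vectors stored as +/-1 codes.
b_db = np.sign(x_db)

# Symmetric estimate: binarize the query too and compare code to code.
sym = ((np.sign(q) - b_db) ** 2).sum(axis=1)

# Asymmetric estimate: compare the raw query to the codes directly.
# The query side loses no information, which tightens the estimate.
asym = ((q - b_db) ** 2).sum(axis=1)
```

Since the query is processed once while the database is scanned many times, keeping the query side exact costs little at search time.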
Spectral Hashing
TLDR
The problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard and a spectral method is obtained whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian.
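The spectral relaxation described in this TLDR, thresholding eigenvectors of a graph Laplacian to obtain binary codes, can be sketched directly with NumPy. This is a bare illustration of the relaxation on a Gaussian affinity graph; the actual method also handles out-of-sample extension, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((60, 5))

# Gaussian affinity graph between points (bandwidth set to the mean
# squared distance, an illustrative choice).
d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
W = np.exp(-d2 / d2.mean())

# Unnormalized graph Laplacian L = D - W.
L = np.diag(W.sum(1)) - W

# The eigenvectors with the smallest nonzero eigenvalues carry the
# partition structure; thresholding them at zero gives the binary codes.
vals, vecs = np.linalg.eigh(L)
n_bits = 4
codes = (vecs[:, 1:1 + n_bits] > 0).astype(np.uint8)
```

Each eigenvector acts like a relaxed graph cut, so each bit splits the data along one smooth direction of the affinity graph.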
Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan
TLDR
A novel algorithm, PQ Fast Scan, is designed that transforms the cache-resident lookup tables into small tables, sized to fit SIMD registers, that allows (i) in-register lookups in place of cache accesses and (ii) an efficient SIMD implementation.
Composite Quantization for Approximate Nearest Neighbor Search
This paper presents a novel compact coding approach, composite quantization, for approximate nearest neighbor search. The idea is to use the composition of several elements selected from the…