Aggregating local descriptors into a compact image representation

Abstract

We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation. We then show how to jointly optimize the dimension reduction and the indexing algorithm, so that it best preserves the quality of vector comparison. The evaluation shows that our approach significantly outperforms the state of the art: the search accuracy is comparable to the bag-of-features approach for an image representation that fits in 20 bytes. Searching a 10 million image dataset takes about 50 ms.
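The abstract does not spell out the aggregation step, so the following is only a minimal sketch of one common way to aggregate local descriptors into a fixed-length vector: assign each descriptor to its nearest centroid, sum the residuals per centroid, concatenate, and L2-normalize. The function names, the toy centroids, and the choice of normalization are assumptions made for illustration, not details taken from the paper. The helper `index_size_bytes` (also hypothetical) only reproduces the memory arithmetic implied by the abstract's figures.

```python
import math


def aggregate_residuals(descriptors, centroids):
    """Aggregate local descriptors into one vector of length k * d.

    descriptors: list of d-dimensional lists (local image descriptors).
    centroids:   list of k d-dimensional lists (assumed to be learned
                 beforehand, e.g. with k-means; not shown here).
    """
    k = len(centroids)
    d = len(centroids[0])
    acc = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Nearest centroid under squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual x - c_nearest for that centroid.
        for j in range(d):
            acc[nearest][j] += x[j] - centroids[nearest][j]

    # Concatenate the per-centroid sums and L2-normalize (an assumption).
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def index_size_bytes(num_images, bytes_per_image):
    """Raw storage for the image codes, ignoring any index overhead."""
    return num_images * bytes_per_image


# Toy example: 2 centroids in 2-D, 3 local descriptors.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
toy_vector = aggregate_residuals(toy_descriptors, toy_centroids)
```

With the figures quoted in the abstract, `index_size_bytes(10_000_000, 20)` gives 200,000,000 bytes, i.e. about 200 MB of raw codes for a 10 million image dataset. That excludes any index overhead.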

DOI: 10.1109/CVPR.2010.5540039

Semantic Scholar estimates that this publication has 1,296 citations based on the available data.

Cite this paper

@article{Jgou2010AggregatingLD,
  title={Aggregating local descriptors into a compact image representation},
  author={Herv{\'e} J{\'e}gou and Matthijs Douze and Cordelia Schmid and Patrick P{\'e}rez},
  journal={2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
  year={2010},
  pages={3304-3311}
}