Object retrieval with large vocabularies and fast spatial matching


In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photo-sharing site Flickr [3], using Oxford landmarks as queries. Because of the size of our dataset, building an image-feature vocabulary is a major time and performance bottleneck. To address this problem, we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees, which we show outperforms the current state of the art on an extensive ground truth. Our experiments show that the quantization has a major effect on retrieval quality. To further improve query performance, we add an efficient spatial verification stage to re-rank the results returned from our bag-of-words model and show that this consistently improves search quality, though by a smaller margin when the visual vocabulary is large. We view this work as a promising step towards much larger, "web-scale" image corpora.
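To make the bag-of-words stage mentioned in the abstract more concrete, the following is a minimal sketch of quantizing descriptors into visual words and ranking corpus images by tf-idf cosine similarity. It rests on several assumptions: the function names (`quantize`, `bag_of_words`, `rank`) are hypothetical, the quantizer is a brute-force nearest-centroid search used purely as a stand-in (it is not the randomized-tree method the paper introduces), and the spatial verification re-ranking stage is omitted entirely.

```python
import math
from collections import Counter


def quantize(descriptor, vocabulary):
    """Return the index of the nearest visual word.

    Brute-force search, used here only as a stand-in for a scalable
    approximate quantizer (an assumption of this sketch).
    """
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[i])),
    )


def bag_of_words(descriptors, vocabulary):
    """Count how often each visual word occurs in one image."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def idf_weights(corpus_bows, vocab_size):
    """Inverse document frequency per visual word (0 for unseen words)."""
    n = len(corpus_bows)
    weights = []
    for word in range(vocab_size):
        df = sum(1 for bow in corpus_bows if word in bow)
        weights.append(math.log(n / df) if df else 0.0)
    return weights


def tfidf(bow, idf):
    """L2-normalized sparse tf-idf vector, stored as {word: weight}."""
    total = sum(bow.values())
    vec = {w: (c / total) * idf[w] for w, c in bow.items()}
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {w: x / norm for w, x in vec.items()} if norm else vec


def rank(query_bow, corpus_bows, vocab_size):
    """Return corpus image indices, most similar to the query first."""
    idf = idf_weights(corpus_bows, vocab_size)
    q = tfidf(query_bow, idf)
    scores = []
    for i, bow in enumerate(corpus_bows):
        d = tfidf(bow, idf)
        scores.append((sum(q[w] * d.get(w, 0.0) for w in q), i))
    # Sort by descending score, breaking ties by image index.
    return [i for _, i in sorted(scores, key=lambda t: (-t[0], t[1]))]
```

In a large-scale setting the per-image loop in `rank` would typically be replaced by an inverted index over visual words, and the top-ranked results would then be passed to a geometric re-ranking step; both are left out to keep the sketch short.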

DOI: 10.1109/CVPR.2007.383172

Semantic Scholar estimates that this publication has 2,157 citations based on the available data.

Cite this paper

@article{Philbin2007ObjectRW,
  title={Object retrieval with large vocabularies and fast spatial matching},
  author={James Philbin and Ondrej Chum and Michael Isard and Josef Sivic and Andrew Zisserman},
  journal={2007 IEEE Conference on Computer Vision and Pattern Recognition},
  year={2007},
  pages={1-8}
}