Scalable forest hashing for fast similarity search


Indexing images and videos using binary hash bits has shown promising results for fast similarity search. Existing data-driven hashing methods learn compact hash codes from the data, but usually at the cost of generating unbalanced hash buckets, which hurts search efficiency. We propose a novel data-driven hashing method called forest hashing, which uses multiple tree structures to hash the data. By leveraging the index structure of trees, we can significantly improve hashing efficacy by generating balanced hash buckets. Moreover, forest hashing naturally supports scalable coding: adding more trees improves coding quality at the cost of a longer code. Finally, forest hashing can be easily extended to semantic search by integrating semi-supervised label information. Experiments on two benchmark datasets show favorable results compared with state-of-the-art hashing methods.
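
The abstract does not describe how the trees are built, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each tree node splits its points at the median of a random projection (a choice made here purely because it yields balanced buckets), that each tree contributes one bit per level, and that the per-tree codes are concatenated so that more trees give a longer code. All names (`build_tree`, `tree_code`, `forest_code`) and parameters are hypothetical.

```python
import random
from collections import Counter


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(w, x))


def build_tree(points, depth, dim, rng):
    """Build one hashing tree (assumed design: median split on a random projection).

    Each node is a tuple (w, threshold, left_child, right_child);
    None marks the bottom of the tree.
    """
    if depth == 0:
        return None
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    proj = sorted(dot(w, p) for p in points)
    n = len(proj)
    if n >= 2:
        # The midpoint of the two middle projections halves the points.
        threshold = 0.5 * (proj[n // 2 - 1] + proj[n // 2])
    elif n == 1:
        threshold = proj[0]
    else:
        threshold = 0.0
    left = [p for p in points if dot(w, p) < threshold]
    right = [p for p in points if dot(w, p) >= threshold]
    return (w, threshold,
            build_tree(left, depth - 1, dim, rng),
            build_tree(right, depth - 1, dim, rng))


def tree_code(tree, x):
    """Walk x down one tree, emitting one bit per level (0 = left, 1 = right)."""
    bits = []
    node = tree
    while node is not None:
        w, threshold, left, right = node
        if dot(w, x) < threshold:
            bits.append(0)
            node = left
        else:
            bits.append(1)
            node = right
    return bits


def forest_code(forest, x):
    """Concatenate the per-tree codes; more trees give a longer code."""
    return [b for tree in forest for b in tree_code(tree, x)]


# Toy data: 64 random 8-dimensional points, 4 trees of depth 3 (12-bit codes).
rng = random.Random(0)
DIM, DEPTH, NUM_TREES = 8, 3, 4
data = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(64)]
forest = [build_tree(data, DEPTH, DIM, rng) for _ in range(NUM_TREES)]
codes = [forest_code(forest, x) for x in data]

# Bucket occupancy per tree: each tree uses its own DEPTH-bit slice of the code.
bucket_sizes = [
    Counter(tuple(c[t * DEPTH:(t + 1) * DEPTH]) for c in codes)
    for t in range(NUM_TREES)
]
```

In this toy setup every tree partitions the 64 training points into 8 equally sized buckets, which is the balance property the abstract emphasizes. Truncating `forest` to fewer trees gives a shorter prefix of the same code, a simple stand-in for the scalable coding described above.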

DOI: 10.1109/ICME.2014.6890219

Cite this paper

@inproceedings{Yu2014ScalableFH,
  title={Scalable forest hashing for fast similarity search},
  author={Gang Yu and Junsong Yuan},
  booktitle={ICME},
  year={2014}
}