Multi-attribute Queries: To Merge or Not to Merge?


Users often have very specific visual content in mind when they search. The most natural way to communicate this content to an image search engine is to use keywords that specify various properties or attributes of the content. A naive way of dealing with such multi-attribute queries is the following: train a classifier for each attribute independently, and then combine their scores on images to judge their fit to the query. We argue that this may not be the most effective or efficient approach. Conjunctions of attributes often correspond to very characteristic appearances. It would thus be beneficial to train classifiers that detect these conjunctions as a whole. But not all conjunctions result in such tight appearance clusters. So given a multi-attribute query, which conjunctions should we model? An exhaustive evaluation of all possible conjunctions would be time-consuming. Hence we propose an optimization approach that identifies beneficial conjunctions without explicitly training the corresponding classifiers. It reasons about geometric quantities that capture notions similar to intra- and inter-class variances. We exploit a discriminative binary space to compute these geometric quantities efficiently. Experimental results on two challenging datasets of objects and birds show that our proposed approach can improve performance significantly over several strong baselines, while being an order of magnitude faster than exhaustively searching through all possible conjunctions.
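To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the toy scores, and the toy binary codes are all hypothetical. The first function illustrates the naive baseline (summing independent per-attribute scores). The second uses mean pairwise Hamming distance as an assumed stand-in for an intra-class spread measure in a binary space; the paper's actual geometric quantities may be defined differently.

```python
from itertools import combinations


def rank_by_summed_scores(scores, query):
    """Naive baseline: sum independent per-attribute scores, then rank images."""
    totals = {
        image: sum(attr_scores[attr] for attr in query)
        for image, attr_scores in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(codes):
    """Assumed proxy for intra-class spread: average Hamming distance over all pairs."""
    pairs = list(combinations(codes, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Hypothetical per-attribute classifier scores for three images.
scores = {
    "img1": {"red": 0.9, "round": 0.2},
    "img2": {"red": 0.6, "round": 0.7},
    "img3": {"red": 0.1, "round": 0.8},
}
ranking = rank_by_summed_scores(scores, ["red", "round"])

# Hypothetical binary codes of images that satisfy the conjunction "red AND round".
conjunction_codes = [(1, 1, 0, 0), (1, 1, 0, 1), (1, 0, 0, 0)]
spread = mean_pairwise_hamming(conjunction_codes)
```

Under these assumptions, a small spread would suggest that the conjunction forms a tight appearance cluster and may be worth modeling as a single classifier, while a large spread would favor keeping the attributes separate.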

DOI: 10.1109/CVPR.2013.425

Cite this paper

@article{Rastegari2013MultiattributeQT,
  title={Multi-attribute Queries: To Merge or Not to Merge?},
  author={Mohammad Rastegari and Ali Diba and Devi Parikh and Ali Farhadi},
  journal={2013 IEEE Conference on Computer Vision and Pattern Recognition},
  year={2013},
  pages={3310-3317}
}