Learning concept bundles for video search with complex queries

Abstract

Classifiers for primitive visual concepts such as "car" and "sky" are well developed and widely used to support video search for simple queries. However, they are usually ineffective for complex queries such as "one or more people at a table or desk with a computer visible", because such queries carry semantics that are far more complex than, and different from, a simple aggregation of the meanings of their constituent primitive concepts. To facilitate video search for complex queries, we propose a higher-level semantic descriptor named "concept bundle", which integrates multiple primitive concepts, such as "(soccer, fighting)" and "(lion, hunting, zebra)", to describe the visual representation of complex semantics. The proposed approach first automatically selects informative concept bundles. It then builds a novel concept bundle classifier based on multi-task learning by exploiting the relatedness between a concept bundle and its primitive concepts. To model a complex query, it uses an optimal selection strategy that selects related primitive concepts and concept bundles by considering both their classifier performance and their semantic relatedness to the query. The final results are generated by fusing the individual results from the selected primitive concepts and concept bundles. Extensive experiments are conducted on two video datasets: TRECVID 2008 and YouTube. The experimental results indicate that: (a) our concept bundle learning approach outperforms the state-of-the-art methods by at least 19% on TRECVID 2008 and 29% on YouTube; and (b) the use of concept bundles improves search performance for complex queries by at least 37.5% on TRECVID 2008 and 52% on YouTube.
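
The selection and fusion steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not state how classifier performance and semantic relatedness are combined, nor how results are fused, so the product score, the top-k cutoff, and the weighted-average fusion below are all assumptions, and the function names `select_detectors` and `fuse_scores` and every number are hypothetical.

```python
from typing import Dict, List, Tuple


def select_detectors(
    candidates: Dict[str, Tuple[float, float]], k: int
) -> List[Tuple[str, float]]:
    """Rank candidate detectors (primitive concepts or concept bundles).

    candidates maps a detector name to (classifier performance, semantic
    relatedness to the query). ASSUMPTION: the two factors are combined by
    a simple product; the paper's actual selection criterion may differ.
    """
    scored = [(name, perf * rel) for name, (perf, rel) in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def fuse_scores(
    selected: List[Tuple[str, float]],
    detector_scores: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Fuse per-shot detector scores by a normalized weighted average.

    ASSUMPTION: linear late fusion, weighted by the selection score.
    """
    total = sum(weight for _, weight in selected)
    if total == 0:
        return {}
    fused: Dict[str, float] = {}
    for name, weight in selected:
        for shot, score in detector_scores[name].items():
            fused[shot] = fused.get(shot, 0.0) + (weight / total) * score
    return fused


# Hypothetical values for a query about a lion hunting a zebra.
candidates = {
    "lion": (0.6, 0.8),
    "zebra": (0.5, 0.7),
    "(lion, hunting, zebra)": (0.4, 0.95),
    "sky": (0.9, 0.1),  # accurate classifier, but barely related to the query
}
selected = select_detectors(candidates, k=2)

detector_scores = {
    "lion": {"shot_a": 0.9, "shot_b": 0.2},
    "(lion, hunting, zebra)": {"shot_a": 0.3, "shot_b": 0.8},
}
fused = fuse_scores(selected, detector_scores)
ranking = sorted(fused, key=fused.get, reverse=True)
```

In this toy setup the "sky" detector is dropped despite its high accuracy because it is barely related to the query, which is the trade-off the abstract attributes to the selection strategy.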

DOI: 10.1145/2072298.2072357

Cite this paper

@inproceedings{Yuan2011LearningCB,
  title={Learning concept bundles for video search with complex queries},
  author={Jin Yuan and Zheng-Jun Zha and Yan-Tao Zheng and Meng Wang and Xiangdong Zhou and Tat-Seng Chua},
  booktitle={ACM Multimedia},
  year={2011}
}