Discovering human interactions in videos with limited data labeling

Abstract

We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interactions in an unsupervised fashion. A new iterative solution is introduced based on Maximum Margin Clustering (MMC), which also accepts user feedback to refine clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out on three challenging datasets: Collective Activity, VIRAT, and UT-Interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.
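To give a feel for the MMC idea the abstract builds on, the sketch below implements a toy two-cluster max-margin clustering loop in plain NumPy: it alternates between training a linear classifier with a hinge loss on the current cluster labels and relabeling points by classifier score under a balance constraint (the standard device in the MMC literature to avoid the degenerate single-cluster solution). All names and parameters here are illustrative assumptions; this is not the paper's constrained latent formulation, which additionally incorporates user-feedback constraints.

```python
import numpy as np

def mmc_2cluster_sketch(X, iters=10, lr=0.1, reg=0.01, seed=0):
    """Toy max-margin clustering for two clusters (illustrative sketch).

    Alternates between (a) hinge-loss gradient steps for a linear
    classifier on the current labels and (b) relabeling points by
    classifier score with a 50/50 class-balance constraint.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    y = np.where(rng.random(n) < 0.5, 1.0, -1.0)  # random initial labels
    w = np.zeros(d)
    for _ in range(iters):
        # (a) a few gradient steps on the regularized hinge loss
        for _ in range(50):
            margins = y * (X @ w)
            mask = margins < 1.0                  # margin violators
            grad = reg * w - (y[mask, None] * X[mask]).sum(0) / n
            w -= lr * grad
        # (b) relabel by score, forcing an even split across clusters
        order = np.argsort(X @ w)
        y_new = np.ones(n)
        y_new[order[: n // 2]] = -1.0
        if np.array_equal(y_new, y):              # converged
            break
        y = y_new
    return y, w
```

The balance step is the key design choice: without it, the alternation can collapse to labeling every point identically, since that trivially maximizes the margin. The paper's formulation generalizes this alternation to multiple clusters with latent variables and user-supplied constraints.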

DOI: 10.1109/CVPRW.2015.7301278


Cite this paper

@inproceedings{Khodabandeh2015DiscoveringHI, title={Discovering human interactions in videos with limited data labeling}, author={Mehran Khodabandeh and Arash Vahdat and Guang-Tong Zhou and Hossein Hajimirsadeghi and Mehrsan Javan Roshtkhari and Greg Mori and Stephen Se}, booktitle={2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)}, year={2015}, pages={9-18} }