Action Recognition with Improved Trajectories

Abstract

Recently, dense trajectories were shown to be an efficient video representation for action recognition, achieving state-of-the-art results on a variety of datasets. This paper improves their performance by taking camera motion into account and correcting for it. To estimate camera motion, we match feature points between frames using SURF descriptors and dense optical flow, which are shown to be complementary. These matches are then used to robustly estimate a homography with RANSAC. Human motion is in general different from camera motion and generates inconsistent matches; to improve the estimation, a human detector is employed to remove such matches. Given the estimated camera motion, we remove trajectories consistent with it. We also use this estimate to cancel out camera motion from the optical flow, which significantly improves motion-based descriptors such as HOF and MBH. Experimental results on four challenging action datasets (Hollywood2, HMDB51, Olympic Sports and UCF50) significantly outperform the current state of the art.
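The core geometric step described above — fitting a homography to frame-to-frame feature matches with RANSAC, so that camera-induced motion can be subtracted from trajectories and optical flow — can be sketched in plain NumPy. This is an illustrative reconstruction, not the authors' code: function names are ours, the matches would in practice come from SURF descriptors and dense optical flow (e.g., via OpenCV), and the outlier points below stand in for matches on a moving person that the paper removes with a human detector.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography (homogeneous divide)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    proj = pts_h @ H.T
    return proj[:, :2] / proj[:, 2:3]

def estimate_homography(src, dst):
    """Direct Linear Transform: solve for H with dst ~ H @ src (>= 4 matches)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # Null-space of A (last right-singular vector) gives the 9 entries of H.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=500, thresh=2.0, rng=None):
    """RANSAC: sample 4 matches, fit H, keep the fit with most inliers."""
    rng = rng if rng is not None else np.random.default_rng(0)
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best_H, best_inliers = None, 0
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = estimate_homography(src[idx], dst[idx])
        err = np.linalg.norm(apply_homography(H, src) - dst, axis=1)
        inliers = int((err < thresh).sum())
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H

def cancel_camera_motion(flow_pts, flow_vecs, H):
    """Subtract the camera-induced displacement H(p) - p from observed flow."""
    camera_flow = apply_homography(H, flow_pts) - flow_pts
    return flow_vecs - camera_flow
```

With the estimated `H`, trajectories whose displacement matches `H(p) - p` can be discarded as camera motion, and `cancel_camera_motion` yields the compensated flow that feeds HOF and MBH descriptors.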

DOI: 10.1109/ICCV.2013.441


Citations per Year (chart, 2013–2017)

1,192 Citations

Semantic Scholar estimates that this publication has 1,192 citations based on the available data.


Cite this paper

@article{Wang2013ActionRW,
  title   = {Action Recognition with Improved Trajectories},
  author  = {Heng Wang and Cordelia Schmid},
  journal = {2013 IEEE International Conference on Computer Vision},
  year    = {2013},
  pages   = {3551-3558}
}