Action recognition by dense trajectories


Feature trajectories have been shown to be efficient for representing videos. Typically, they are extracted using the KLT tracker or by matching SIFT descriptors between frames. However, both the quality and the quantity of these trajectories are often insufficient. Inspired by the recent success of dense sampling in image classification, we propose an approach to describe videos by dense trajectories. We sample dense points from each frame and track them based on displacement information from a dense optical flow field. Given a state-of-the-art optical flow algorithm, our trajectories are robust to fast irregular motions as well as shot boundaries. Additionally, dense trajectories cover the motion information in videos well. We also investigate how to design descriptors to encode the trajectory information. We introduce a novel descriptor based on motion boundary histograms, which is robust to camera motion. This descriptor consistently outperforms other state-of-the-art descriptors, in particular on uncontrolled realistic videos. We evaluate our video description in the context of action classification with a bag-of-features approach. Experimental results show a significant improvement over the state of the art on four datasets of varying difficulty, i.e., KTH, YouTube, Hollywood2 and UCF sports.
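To illustrate the core idea of tracking densely sampled points with a dense optical flow field, below is a minimal sketch in Python using OpenCV's Farneback flow as a stand-in for the flow algorithm used in the paper. The function name, the sampling stride and the trajectory length are illustrative assumptions, and the sketch omits details of the actual method such as median filtering of the flow field, re-sampling new points to maintain density, and pruning of static or erratic trajectories.

```python
import cv2
import numpy as np

STEP = 5         # sampling stride in pixels (assumed value, for illustration)
MAX_LENGTH = 15  # maximum trajectory length in frames (assumed value)

def dense_trajectories(frames, step=STEP, max_length=MAX_LENGTH):
    """Track densely sampled grid points through a dense optical flow field.

    `frames` is a list of BGR images. Returns a list of trajectories,
    each a list of (x, y) positions, one per frame the point was tracked.
    """
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    h, w = prev.shape
    # Initialise one trajectory per grid point, sampled every `step` pixels.
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    tracks = [[(float(x), float(y))] for x, y in zip(xs.ravel(), ys.ravel())]
    finished = []

    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense flow between consecutive frames (Farneback as a simple stand-in).
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        new_tracks = []
        for tr in tracks:
            x, y = tr[-1]
            xi, yi = int(round(x)), int(round(y))
            if not (0 <= xi < w and 0 <= yi < h):
                finished.append(tr)  # point left the frame; stop tracking it
                continue
            dx, dy = flow[yi, xi]    # displacement at the point's location
            tr.append((x + float(dx), y + float(dy)))
            if len(tr) >= max_length:
                finished.append(tr)  # descriptors (e.g. MBH) would be computed here
            else:
                new_tracks.append(tr)
        tracks = new_tracks
        prev = gray
    return finished + tracks
```

In a full pipeline, descriptors computed along each finished trajectory would then be quantized and pooled with a bag-of-features representation for classification.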

DOI: 10.1109/CVPR.2011.5995407

Cite this paper

@inproceedings{Wang2011ActionRB,
  title={Action recognition by dense trajectories},
  author={Heng Wang and Alexander Kl{\"a}ser and Cordelia Schmid and Cheng-Lin Liu},
  booktitle={CVPR},
  year={2011}
}