Publications
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
This work introduces UCF101, currently the largest dataset of human actions, and provides baseline action recognition results on the new dataset using a standard bag-of-words approach, with an overall performance of 44.5%.
Object tracking: A survey
The goal of this article is to review state-of-the-art tracking methods, classify them into categories, and identify new trends. It discusses important issues related to tracking, including the use of appropriate image features, the selection of motion models, and the detection of objects.
Recognizing realistic actions from videos “in the wild”
This paper presents a systematic framework for recognizing realistic actions from videos “in the wild”. Motion statistics are used to acquire stable motion features and clean static features, and PageRank is used to mine the most informative static features.
Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition
This paper generalizes the traditional MACH filter to video (a 3D spatiotemporal volume) and to vector-valued data, and analyzes the response of the filter in the frequency domain to avoid the high computational cost commonly incurred in template-based approaches.
Multi-source Multi-scale Counting in Extremely Dense Crowd Images
This work relies on multiple sources, such as low-confidence head detections, repetition of texture elements, and frequency-domain analysis, to estimate counts in an image region along with the confidence of observing individuals, and employs a global consistency constraint on counts using a Markov Random Field.
A 3-Dimensional SIFT Descriptor and its Application to Action Recognition
This paper uses a bag-of-words approach to represent videos, and presents a method to discover relationships between spatio-temporal words in order to better describe the video data.
Recognizing 50 human action categories of web videos
  • K. Reddy, M. Shah
  • Machine Vision and Applications
  • 1 July 2013
This paper proposes using scene context information obtained from moving and stationary pixels in the key frames, in conjunction with motion features, to solve the action recognition problem on a large (50-action) dataset of videos from the web.
Abnormal crowd behavior detection using social force model
This paper presents a novel method to detect and localize abnormal behaviors in crowd videos using the Social Force model, and shows that the social-force approach outperforms similar approaches based on pure optical flow.
Visual Tracking: An Experimental Survey
It is demonstrated that trackers can be evaluated objectively by survival curves, Kaplan-Meier statistics, and Grubbs testing, and it is found that in evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score.
Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds
A novel approach is proposed that simultaneously solves the problems of counting, density-map estimation, and localization of people in a given dense crowd image. It significantly outperforms the state of the art on a new dataset, the most challenging to date, with the largest number of crowd annotations across the most diverse set of scenes.