Learn More
In group activity recognition, the temporal dynamics of the whole activity can be inferred based on the dynamics of the individual people representing the activity. We build a deep model to capture these dynamics based on LSTM (long short-term memory) models. To make use of these observations, we present a 2-stage deep temporal model for the group activity(More)
This paper presents a deep neural-network-based hierarchical graphical model for individual and group activity recognition in surveillance scenes. Deep networks are used to recognize the actions of individual people in a scene. Next, a neural-network-based hierarchical graphical model refines the predicted labels for each class by considering dependencies(More)
A real world scene may contain several objects with different spatial and temporal characteristics. This paper proposes a novel method for the classification of natural scenes by processing both spatial and temporal information from the video. For extracting the spatial characteristics, we build spatial pyramids using the spatial pyramid matching (SPM)(More)
—In this paper we present an approach for classifying the activity performed by a group of people in a video sequence. This problem of group activity recognition can be addressed by examining individual person actions and their relations. Temporal dynamics exist both at the level of individual person actions as well as at the level of group activity. Given(More)
Given a video, there are many algorithms to separate static and dynamic objects present in the scene. The proposed work is focused on classifying the dynamic objects further as having either repetitive or non-repetitive motion. In this work, we propose a novel approach to achieve this challenging task by processing the optical flow fields corresponding to(More)
We present an algorithm for learning a feature representation for video segmentation. Standard video seg-mentation algorithms utilize similarity measurements in order to group related pixels. The contribution of our paper is an unsupervised method for learning the feature representation used for this similarity. The feature representation is defined over(More)
—This paper describes a novel approach for extraction of multiple objects from a given image of a natural scene. In the proposed approach, multiple objects are extracted by the application of saliency detection on the image. We use two distinct approaches for object extraction. One approach uses superpixels on the saliency map. Then the intensity of(More)
  • 1