A representational gap exists between low-level measurements (segmentation, object classification, tracking) and high-level understanding of video sequences. In this paper, we propose a novel representation of events in videos to bridge this gap, based on the CASE representation of natural languages. The proposed representation has three significant …
A framework for classification of meeting videos is proposed in this paper. Our goal is to utilize this framework to analyze human motion data and perform automatic meeting classification. We use a rule-based system and a state machine to analyze the videos, utilizing three levels of context hierarchy, namely movements (and their attributes), events (actions), …
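The layered, rule-based idea in the abstract above can be sketched in a few lines; the movement labels, event mapping, and classification rules below are invented for illustration and are not taken from the paper:

```python
# Hypothetical three-level hierarchy: movements -> events -> meeting label.
# All labels and rules here are illustrative assumptions.
MOVEMENT_TO_EVENT = {
    "stand_up": "presentation_start",
    "walk_to_board": "presentation_start",
    "raise_hand": "discussion",
    "sit_down": "idle",
}

def classify_meeting(movements):
    """Rule-based pass: map low-level movements to events, then apply
    simple rules over the event sequence to label the meeting."""
    events = [MOVEMENT_TO_EVENT.get(m, "idle") for m in movements]
    if "presentation_start" in events:
        return "presentation"
    if events.count("discussion") >= 2:
        return "discussion"
    return "casual"

label = classify_meeting(["sit_down", "stand_up", "walk_to_board"])
```

A real system would replace the dictionary lookup with tracked motion attributes and the if-rules with a proper state machine, but the flow from movements through events to a final label is the same.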
In this paper, we model multi-agent events in terms of a temporally varying sequence of sub-events, and propose a novel approach for learning, detecting and representing events in videos. The proposed approach has three main steps. First, in order to learn the event structure from training videos, we automatically encode the sub-event dependency graph, …
This paper proposes a novel method for estimating the geospatial trajectory of a moving camera. The proposed method uses a set of reference images with known GPS (global positioning system) locations to recover the trajectory of a moving camera using geometric constraints. The proposed method has three main steps. First, scale-invariant feature transform …
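The descriptor-matching step such a pipeline relies on can be sketched with synthetic data; the random descriptors, the Lowe-style ratio-test threshold, and the averaging of reference GPS tags below are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

# Hypothetical 128-D descriptors; in practice these would come from SIFT
# applied to video frames and to the GPS-tagged reference images.
rng = np.random.default_rng(0)
ref_desc = rng.normal(size=(50, 128))        # descriptors from reference images
ref_gps = rng.uniform(-1, 1, size=(50, 2))   # (lat, lon) tag per descriptor
# Query frame: noisy copies of the first 10 reference descriptors.
query_desc = ref_desc[:10] + rng.normal(scale=0.01, size=(10, 128))

def match_and_locate(query, refs, gps, ratio=0.8):
    """Nearest-neighbour matching with a ratio test; returns the mean
    GPS of the matched references as a crude location estimate."""
    matched = []
    for q in query:
        d = np.linalg.norm(refs - q, axis=1)
        i, j = np.argsort(d)[:2]
        if d[i] < ratio * d[j]:   # accept only distinctive matches
            matched.append(gps[i])
    return np.mean(matched, axis=0) if matched else None

est = match_and_locate(query_desc, ref_desc, ref_gps)
```

A full trajectory estimator would add the geometric constraints the abstract mentions (e.g. epipolar geometry between frames) on top of this raw matching step.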
Novel computer vision techniques using multiple commercial off-the-shelf (COTS) sensors can be used on mobile platforms to efficiently detect and track objects over large areas. Video sensor networks play a vital role in unattended wide area surveillance. Most of the computer vision research in this area deals with networks of stationary electro-optical …
This paper presents a novel object-based video coding framework for videos obtained from a static camera. As opposed to most existing methods, the proposed method does not require explicit 2D or 3D models of objects and hence is general enough to cater for varying types of objects in the scene. The proposed system detects and tracks objects in the scene and …
In this paper, we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting …