Corpus ID: 16469354

Team SRI-Sarnoff's AURORA System @ TRECVID 2011

  • Hui Cheng, Amir Tamrakar, Saad Ali, Qian Yu, Omar Javed, Jingen Liu, Ajay Divakaran, Harpreet S. Sawhney, Alexander G. Hauptmann, Mubarak Shah, Subhabrata Bhattacharya, M. Witbrock, Jon Curtis, Gerald Friedland, Robert Mertens, Trevor Darrell, R. Manmatha, James Allan
In this paper, we present results from the experimental evaluation for the TRECVID 2011 MED11 (Multimedia Event Detection) task as a part of Team SRI-Sarnoff's AURORA system being developed under the IARPA ALADDIN Program. Our approach employs two classes of content descriptions for describing videos depicting diverse events: (1) low-level features and their aggregates, and (2) semantic concepts that capture scenes, objects and atomic actions that are local in space-time. In this presentation…
Complex event recognition using constrained low-rank representation
This work proposes a novel low-rank formulation, which combines the precisely annotated videos used to train the concepts with the rich concept scores, and finds a new representation for each event that is not only low-rank but also constrained to adhere to the concept annotation, thus suppressing the noise and maintaining a consistent occurrence of the concepts in each event.
High-level event recognition in unconstrained videos
While the existing solutions vary, common key modules are identified, and detailed descriptions, along with some insights for each, are provided, including extraction and representation of low-level features across different modalities, classification strategies, fusion techniques, etc.
Recognition of complex events in open-source web-scale videos: a bottom up approach
This symposium proposal presents a systematic decomposition of complex events into hierarchical components and makes an in-depth analysis of how existing research is being used to cater to various levels of this hierarchy.
Complex Event Recognition Using Constrained Rank Optimization
This chapter discusses a low-rank formulation, which combines the precisely annotated videos used to train the concepts with the rich concept scores, and demonstrates that the approach consistently improves the discriminativity of the concept scores by a significant margin.
Recognition of Complex Events in Open-source Web-scale Videos: Features, Intermediate Representations and Their Temporal Interactions
Recognition of complex events in consumer-uploaded Internet videos, captured under real-world settings, has emerged as a challenging area of research across both computer vision and multimedia…
Action recognition by graph embedding and temporal classifiers
A novel framework for selecting a set of prototypes from a labelled graph set, taking class discrimination into account, is created, and experimental results show that such a discriminative prototype selection framework can achieve superior results, not only for the task of human action recognition but also in the classification of various structured data, compared to other well-established prototype selection approaches.
Exploiting probabilistic relationships between action concepts for complex event classification
A probabilistic framework is proposed that models the conditional relationships between the concepts and events, together with an approximate yet tractable solution for inferring the posterior distribution to perform event classification.
Research Statement –
The goal of complex event recognition [9–12] is to automatically detect high-level events in a given video sequence. However, due to the fast growing popularity of such videos, especially on the Web,…
Representing and Retrieving Video Shots in Human-Centric Brain Imaging Space
This paper investigates a novel methodology of representing and retrieving video shots using human-centric high-level features derived in brain imaging space (BIS), where brain responses to natural stimulus of video watching can be explored and interpreted.


MoSIFT: Recognizing Human Actions in Surveillance Videos
This paper proposes an algorithm called MoSIFT, which detects interest points and not only encodes their local appearance but also explicitly models local motion, and introduces a bigram model to construct a correlation between local features to capture the more global structure of actions.
Action recognition by dense trajectories
This work introduces a novel descriptor based on motion boundary histograms, which is robust to camera motion and consistently outperforms other state-of-the-art descriptors, in particular in uncontrolled realistic videos.
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
Space-time interest points
  • I. Laptev, T. Lindeberg
  • Computer Science
  • Proceedings Ninth IEEE International Conference on Computer Vision
  • 2003
This work builds on the idea of the Harris and Förstner interest point operators and detects local structures in space-time where the image values have significant local variations in both space and time, in order to detect spatio-temporal events.
Performance evaluation of local colour invariants
Overall, the shadow invariants perform best: they are most robust to various imaging conditions while maintaining discriminative power. The evaluation also compares the discriminative power and invariance of grey-value invariants to that of colour invariants.
Scale & Affine Invariant Interest Point Detectors
A comparative evaluation of different detectors is presented, and it is shown that the proposed approach for detecting interest points invariant to scale and affine transformations provides better results than existing methods.
Random Forests
  • L. Breiman
  • Mathematics, Computer Science
  • Machine Learning
  • 2001
Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest. The ideas are also applicable to regression.
Distinctive Image Features from Scale-Invariant Keypoints
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are ...
LIBSVM: A library for support vector machines
  • C.-C. Chang, C.-J. Lin
  • ACM T-IST
  • 2011
Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.