• Publications
  • Influence
A Robust and Efficient Video Representation for Action Recognition
This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by…
  • 229
  • 39
Action and Event Recognition with Fisher Vectors on a Compact Feature Set
Action recognition in uncontrolled video is an important and challenging computer vision problem. Recent progress in this area is due to new local features and models that capture spatio-temporal…
  • 379
  • 26
Spatio-temporal Object Detection Proposals
Spatio-temporal detection of actions and events in video is a challenging problem. Besides the difficulties related to recognition, a major challenge for detection in video is the size of the search…
  • 195
  • 20
The LEAR submission at Thumos 2014
We describe the submission of the INRIA LEAR team to the THUMOS workshop in conjunction with ECCV 2014. Our system is based on Fisher vector (FV) encoding of dense trajectory features (DTF), which…
  • 110
  • 5
The AXES submissions at TRECVID 2013
The AXES project participated in the interactive instance search task (INS), the semantic indexing task (SIN), the multimedia event recounting task (MER), and the multimedia event detection task (MED)…
  • 34
  • 5
Efficient Action Localization with Approximately Normalized Fisher Vectors
The Fisher vector (FV) representation is a high-dimensional extension of the popular bag-of-words representation. Transformation of the FV by power and ℓ2 normalizations has been shown to significantly…
  • 71
  • 4
AXES at TRECVID 2012: KIS, INS, and MED
The AXES project participated in the interactive instance search task (INS), the known-item search task (KIS), and the multimedia event detection task (MED) for TRECVid 2012. As in our TRECVid 2011…
  • 36
  • 2
The INRIA-LIM-VocR and AXES submissions to TrecVid 2014 Multimedia Event Detection
This paper describes our participation in the 2014 edition of the TrecVid Multimedia Event Detection task. Our system is based on a collection of local visual and audio descriptors, which are…
  • 6
  • 1
Robust and efficient models for action recognition and localization
Video interpretation and understanding is one of the long-term research goals in computer vision. Realistic videos such as movies present a variety of challenging machine learning problems, such as…
  • 1
  • 1
AXES at TRECVid 2013
The AXES project participated in the interactive instance search task (INS), the semantic indexing task (SIN), the multimedia event recounting task (MER), and the multimedia event detection task (MED)…
  • 1