ActivityNet: A large-scale video benchmark for human activity understanding
TLDR
This paper introduces ActivityNet, a new large-scale video benchmark for human activity understanding that aims at covering a wide range of complex human activities that are of interest to people in their daily living.
Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification
TLDR
A framework for modeling motion by exploiting the temporal structure of human activities, which represents activities as temporal compositions of motion segments and is shown to perform better than other state-of-the-art methods.
Dense-Captioning Events in Videos
TLDR
This work proposes a new model that is able to identify all events in a single pass of the video while simultaneously describing the detected events with natural language, and introduces a new captioning module that uses contextual information from past and future events to jointly describe all events.
Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words
TLDR
The approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.
Unsupervised Learning of Human Action Categories
TLDR
The approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.
A Hierarchical Model of Shape and Appearance for Human Action Classification
TLDR
A hierarchical model that can be characterized as a constellation of bags-of-features and that is able to combine both spatial and spatial-temporal features is proposed and shown to improve the classification performance over bag-of-features models.
DAPs: Deep Action Proposals for Action Understanding
TLDR
Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos, is introduced, which outperforms previous work on a large-scale action benchmark, runs at 134 FPS, making it practical for large-scale scenarios, and exhibits an appealing ability to generalize.
SST: Single-Stream Temporal Action Proposals
TLDR
It is demonstrated empirically that the new Single-Stream Temporal Action Proposals model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature.
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
TLDR
The Extended Connectionist Temporal Classification (ECTC) framework is introduced to efficiently evaluate all possible alignments via dynamic programming and explicitly enforce their consistency with frame-to-frame visual similarities.
Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words
TLDR
A novel unsupervised learning method for human action categories that can recognize and localize multiple actions in long and complex video sequences containing multiple motions.