SST: Single-Stream Temporal Action Proposals

@article{Buch2017SSTST,
  title={SST: Single-Stream Temporal Action Proposals},
  author={S. Buch and Victor Escorcia and Chuanqi Shen and Bernard Ghanem and Juan Carlos Niebles},
  journal={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={6373-6382}
}
Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. [...] Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.
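As context for the abstract above, the following is a minimal sketch of the single-stream proposal idea, assuming PyTorch: a recurrent encoder consumes pre-extracted clip features in one forward pass and, at every time step, scores a bank of candidate proposals ending at that step. The feature dimension, hidden size, and number of proposal scales are illustrative placeholders, not the paper's exact configuration.

# Minimal sketch of a single-stream temporal proposal scorer (assumed PyTorch).
import torch
import torch.nn as nn

class SingleStreamProposals(nn.Module):
    def __init__(self, feat_dim=500, hidden_dim=256, num_scales=32):
        super().__init__()
        # Recurrent encoder runs over the whole video in a single pass.
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # At every time step, emit one confidence per proposal scale:
        # score k corresponds to the candidate covering the last k clips.
        self.scorer = nn.Linear(hidden_dim, num_scales)

    def forward(self, clip_features):
        # clip_features: (batch, time, feat_dim) pre-extracted clip descriptors
        hidden, _ = self.rnn(clip_features)
        # (batch, time, num_scales) proposal confidences in [0, 1]
        return torch.sigmoid(self.scorer(hidden))

# Usage: score candidate proposals ending at each time step of a long video.
model = SingleStreamProposals()
video = torch.randn(1, 1000, 500)   # 1000 clips of 500-d features
confidences = model(video)          # shape (1, 1000, 32)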
End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos
TLDR
This work introduces a new architecture for Single-Stream Temporal Action Detection (SS-TAD), which effectively integrates joint action detection with its semantic sub-tasks in a single unifying end-to-end framework.
Temporal Attention Network for Action Proposal
TLDR
A Temporal Attention Network (TAN) model is introduced to adaptively combine clip-level features and form a compact and discriminative video representation, addressing the limitations of state-of-the-art temporal action proposal methods.
Temporal action proposal for online driver action monitoring using Dilated Convolutional Temporal Prediction Network
TLDR
A new Dilated Convolutional Temporal Prediction Network is proposed that features 1-D dilated convolution operations in a residual network (ResNet)-like architecture, generating action proposals on the order of fractions of a second.
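For the dilated-convolution entry above, here is a minimal sketch of a 1-D dilated residual block, assuming PyTorch; channel counts and dilation rates are illustrative assumptions rather than that network's actual configuration.

# Minimal sketch of a ResNet-style 1-D dilated convolution block (assumed PyTorch).
import torch
import torch.nn as nn

class DilatedResidualBlock1D(nn.Module):
    def __init__(self, channels=128, dilation=2):
        super().__init__()
        # Padding equal to the dilation keeps the temporal length unchanged.
        pad = dilation
        self.conv1 = nn.Conv1d(channels, channels, kernel_size=3,
                               padding=pad, dilation=dilation)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size=3,
                               padding=pad, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, time); dilation enlarges the temporal
        # receptive field without downsampling or extra parameters.
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)   # residual (ResNet-style) skip connection

# Usage: stack blocks with increasing dilation to cover long temporal context.
features = torch.randn(1, 128, 512)                 # 512 time steps
block = DilatedResidualBlock1D(channels=128, dilation=2)
out = block(features)                               # shape preserved: (1, 128, 512)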
Deep Point-wise Prediction for Action Temporal Proposal
TLDR
A simple and effective method for temporal action proposal generation, named Deep Point-wise Prediction (DPP), which simultaneously predicts the probability that an action is present and the corresponding temporal locations, without relying on handcrafted sliding windows or grouping.
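For the DPP entry above, a minimal sketch of an anchor-free, point-wise prediction head is given below, assuming PyTorch; the backbone, channel sizes, and offset parameterization are assumptions for illustration, not the paper's exact design.

# Minimal sketch of a point-wise proposal head (assumed PyTorch).
import torch
import torch.nn as nn

class PointwiseProposalHead(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        # Per-time-point "actionness" score (probability an action covers this point).
        self.cls_head = nn.Conv1d(channels, 1, kernel_size=1)
        # Per-time-point regression of distances to the action's start and end.
        self.reg_head = nn.Conv1d(channels, 2, kernel_size=1)

    def forward(self, features):
        # features: (batch, channels, time) from any temporal backbone
        actionness = torch.sigmoid(self.cls_head(features))   # (batch, 1, time)
        offsets = torch.relu(self.reg_head(features))         # (batch, 2, time), non-negative
        return actionness, offsets

# Usage: every time point t directly yields a scored candidate segment
# [t - offsets[0], t + offsets[1]], with no sliding windows or grouping step.
features = torch.randn(1, 256, 200)
head = PointwiseProposalHead()
scores, offsets = head(features)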
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
TLDR
TAL-Net is proposed, an improved approach to temporal action localization in video inspired by the Faster R-CNN object detection framework; it achieves state-of-the-art performance for both action proposal and localization on the THUMOS'14 detection benchmark and competitive performance on the ActivityNet challenge.
Learning Temporal Action Proposals With Fewer Labels
TLDR
This work proposes a semi-supervised learning algorithm specifically designed for training temporal action proposal networks and shows that this approach consistently matches or outperforms the fully supervised state-of-the-art approaches.
Temporal Recurrent Networks for Online Action Detection
TLDR
A novel framework, the Temporal Recurrent Network (TRN), models greater temporal context for each frame by simultaneously performing online action detection and anticipation of the immediate future, integrating both into a unified end-to-end architecture.
Temporal Action Detection with Structured Segment Networks
TLDR
The structured segment network (SSN) is presented, a novel framework which models the temporal structure of each action instance via a structured temporal pyramid and introduces a decomposed discriminative model comprising two classifiers, respectively for classifying actions and determining completeness.
Fully Convolutional Network for Multiscale Temporal Action Proposals
TLDR
A fully convolutional network to identify multiscale temporal action proposals (FCN-TAP) is proposed, which uses only temporal convolutions to retrieve accurate action proposals for video sequences.
Multi-Scale Proposal Regression Network for Temporal Action Proposal Generation
TLDR
A novel network, named the multi-scale proposal regression network (MPRN), is introduced for temporal action proposal generation; it takes encoded visual features as input and predicts action scores for time points, which are then grouped to generate rough proposals.

References

Showing 1-10 of 43 references
Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos
TLDR
This paper introduces a proposal method that aims to recover temporal segments containing actions in untrimmed videos, along with a learning framework to represent and retrieve activity proposals.
DAPs: Deep Action Proposals for Action Understanding
TLDR
Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos, is introduced; it outperforms previous work on a large-scale action benchmark, runs at 134 FPS, making it practical for large-scale scenarios, and exhibits an appealing ability to generalize.
Fast action proposals for human action detection and search
Gang Yu, Junsong Yuan. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
TLDR
Experimental results on two challenging datasets, MSRII and UCF101, validate the superior performance of the action proposals as well as competitive results on action detection and search.
Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs
TLDR
A novel loss function for the localization network is proposed to explicitly consider temporal overlap and achieve high temporal localization accuracy in untrimmed long videos.
A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection
TLDR
This paper presents a multi-stream bi-directional recurrent neural network for fine-grained action detection that significantly outperforms state-of-the-art action detection methods on both datasets.
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
TLDR
A novel variant of long short-term memory deep networks is defined for modeling these temporal relations via multiple input and output connections, and it is shown that this model improves action labeling accuracy and further enables deeper understanding tasks ranging from structured retrieval to action prediction.
Finding action tubes
TLDR
This work addresses the problem of action detection in videos using rich feature hierarchies derived from shape and kinematic cues and extracts spatio-temporal feature representations to build strong classifiers using Convolutional Neural Networks.
APT: Action localization proposals from dense trajectories
TLDR
This paper proposes bypassing the segmentation step of existing proposals completely by generating proposals directly from the dense trajectories used to represent videos during classification, using an efficient proposal generation algorithm to handle the high number of trajectories in a video.
Parsing Videos of Actions with Segmental Grammars
TLDR
This work describes simple grammars that capture hierarchical temporal structure while admitting inference with a finite-state machine, which makes parsing linear time, constant storage, and naturally online.
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
TLDR
This work proposes a simple yet effective method that takes weak video labels and noisy image labels as input, and generates localized action frames as output that are used to train action recognition models with long short-term memory networks.