CTAP: Complementary Temporal Action Proposal Generation

@article{Gao2018CTAPCT,
  title={CTAP: Complementary Temporal Action Proposal Generation},
  author={J. Gao and Kan Chen and Ramakant Nevatia},
  journal={ArXiv},
  year={2018},
  volume={abs/1807.04821}
}
Temporal action proposal generation is an important task. Akin to object proposals, temporal action proposals are intended to capture “clips” or temporal intervals in videos that are likely to contain an action. Previous methods can be divided into two groups: sliding window ranking and actionness score grouping. Sliding windows uniformly cover all segments in videos, but their temporal boundaries are imprecise; grouping-based methods may have more precise boundaries but may omit some proposals…
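The two families of methods named in the abstract can be illustrated with a minimal Python sketch. This is not the CTAP method itself, only a toy illustration of the trade-off the abstract describes; the function names, the half-open frame intervals, the 0.5 overlap, and the 0.5 threshold are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: a proposal is a half-open frame interval [start, end).
Interval = Tuple[int, int]


def sliding_windows(num_frames: int, sizes: List[int], overlap: float = 0.5) -> List[Interval]:
    """Cover the video uniformly with fixed-size windows (assumed 50% overlap)."""
    proposals = []
    for size in sizes:
        stride = max(1, int(size * (1.0 - overlap)))
        for start in range(0, max(1, num_frames - size + 1), stride):
            proposals.append((start, min(start + size, num_frames)))
    return proposals


def group_actionness(scores: List[float], threshold: float = 0.5) -> List[Interval]:
    """Group consecutive frames whose actionness score reaches the threshold."""
    proposals = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a run of high-actionness frames begins
        elif score < threshold and start is not None:
            proposals.append((start, i))  # the run ends just before frame i
            start = None
    if start is not None:
        proposals.append((start, len(scores)))  # run extends to the last frame
    return proposals


windows = sliding_windows(num_frames=8, sizes=[4])
groups = group_actionness([0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.9])
```

In this toy setting the windows cover every part of the video but their boundaries are fixed by the stride, while the grouped intervals follow the scores closely and disappear wherever the scores fall below the threshold, which mirrors the complementary weaknesses the abstract points out.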
Boundary discrimination and proposal evaluation for temporal action proposal generation
TLDR
This work proposes a novel method that generates proposals by evaluating the continuity of video frames, and then locates the start and the end with low continuity, and outperforms the state-of-the-art proposal generation methods.
Multi-Scale Proposal Regression Network for Temporal Action Proposal Generation
TLDR
A novel network is introduced, named multi-scale proposal regression network (MPRN), for temporal action proposal generation, which takes encoding visual features as input and predicts action scores for time points, in order to group them to generate rough proposals.
Fast Learning of Temporal Action Proposal via Dense Boundary Generator
TLDR
An efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals.
Multi-Granularity Generator for Temporal Action Proposal
TLDR
By temporally adjusting the segment proposals with fine-grained information based on frame actionness, MGG achieves superior performance over state-of-the-art methods on the public THUMOS-14 and ActivityNet-1.3 datasets.
Self-Similarity Action Proposal
TLDR
Self-Similarity Action Proposal (SSAP), a simple method that generates action proposals using the self-similarity of videos, achieves state-of-the-art performance on THUMOS14 and competitive results on ActivityNet v1.3.
Complementary Boundary Estimation Network for Temporal Action Proposal Generation
TLDR
Complementary Boundary Estimation Network (CBEN) is proposed, an improved approach to temporal action proposal generation based on the framework of Boundary Sensitive Network that can achieve better performance than current mainstream methods on temporal action proposal generation.
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
TLDR
A Local-Global Temporal Encoder (LGTE) and Temporal Boundary Regressor (TBR) are designed to combine these two regression granularities in an end-to-end fashion, which achieves the precise boundaries and reliable confidence of proposals through progressive refinement.
PCAD: Proposal Complementary Action Detector
TLDR
This paper presents a novel proposal complementary action detector (PCAD) to deal with video streams under continuous, untrimmed conditions and learns an efficient classifier to classify the generated proposals into different activities and refine their temporal boundaries at the same time.
BMN: Boundary-Matching Network for Temporal Action Proposal Generation
TLDR
This work proposes an effective, efficient and end-to-end proposal generation method, named Boundary-Matching Network (BMN), which generates proposals with precise temporal boundaries as well as reliable confidence scores simultaneously, and can achieve state-of-the-art temporal action detection performance.
Boundary Adjusted Network Based on Cosine Similarity for Temporal Action Proposal Generation
TLDR
Inspired by the similarity comparison in face recognition and the similarity of actions within the same action segment, a module is designed to compare the similarity of visual features extracted by a visual feature encoder to generate high-quality proposals.

References

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals
TLDR
A novel Temporal Unit Regression Network (TURN) model, which jointly predicts action proposals and refines the temporal boundaries by temporal coordinate regression, and outperforms state-of-the-art methods on THUMOS-14 and ActivityNet datasets.
Single Shot Temporal Action Detection
TLDR
This work proposes a novel Single Shot Action Detector (SSAD) network based on 1D temporal convolutional layers to skip the proposal generation step by directly detecting action instances in untrimmed video, and empirically investigates input feature types and fusion strategies to further improve detection accuracy.
Temporal Action Detection with Structured Segment Networks
TLDR
The structured segment network (SSN) is presented, a novel framework which models the temporal structure of each action instance via a structured temporal pyramid and introduces a decomposed discriminative model comprising two classifiers, respectively for classifying actions and determining completeness.
Cascaded Boundary Regression for Temporal Action Detection
TLDR
A two-stage temporal action detection pipeline with Cascaded Boundary Regression (CBR) model, which uses temporal coordinate regression to refine the temporal boundaries of the sliding windows to achieve state-of-the-art performance on both datasets.
Temporal Action Localization with Pyramid of Score Distribution Features
TLDR
A Pyramid of Score Distribution Feature (PSDF) is proposed to capture the motion information at multiple resolutions centered at each detection window, which mitigates the influence of unknown action position and duration, and shows significant performance gain over previous detection approaches.
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos
TLDR
A novel Convolutional-De-Convolutional (CDC) network that places CDC filters on top of 3D ConvNets, which have been shown to be effective for abstracting action semantics but reduce the temporal length of the input data.
SST: Single-Stream Temporal Action Proposals
TLDR
It is demonstrated empirically that the new Single-Stream Temporal Action Proposals model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature.
DAPs: Deep Action Proposals for Action Understanding
TLDR
Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos, is introduced, which outperforms previous work on a large scale action benchmark, runs at 134 FPS making it practical for large-scale scenarios, and exhibits an appealing ability to generalize.
Temporal Action Localization by Structured Maximal Sums
We address the problem of temporal action localization in videos. We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum…
Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos
TLDR
This paper introduces a proposal method that aims to recover temporal segments containing actions in untrimmed videos and introduces a learning framework to represent and retrieve activity proposals.