Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition

@inproceedings{Lan2015BeyondGP,
  title={Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition},
  author={Zhenzhong Lan and Ming Lin and Xuanchong Li and Alexander Hauptmann and Bhiksha Raj},
  booktitle={2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2015},
  pages={204--212}
}
  • Zhenzhong Lan, Ming Lin, Xuanchong Li, Alexander Hauptmann, Bhiksha Raj
  • Published 2015
  • Computer Science, Mathematics
  • 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Most state-of-the-art action feature extractors involve differential operators, which act as highpass filters and tend to attenuate low-frequency action information. This attenuation introduces bias to the resulting features and generates ill-conditioned feature matrices. The Gaussian Pyramid has been used as a feature-enhancing technique that encodes scale-invariant characteristics into the feature space in an attempt to deal with this attenuation. However, at the core of the Gaussian Pyramid…
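The highpass behaviour mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible differential operator, a temporal difference y[t] = x[t] - x[t - skip], and evaluates its gain at a given frequency. The function name `difference_gain` and the `skip` parameter are hypothetical and chosen for illustration only; this is not the feature extractor or the stacking method from the paper.

```python
import cmath
import math


def difference_gain(omega, skip=1):
    """Gain of the filter y[t] = x[t] - x[t - skip] at angular
    frequency omega (radians per sample).

    The frequency response is H(omega) = 1 - exp(-j * omega * skip),
    so the gain equals 2 * |sin(omega * skip / 2)|.
    Illustrative assumption: a plain temporal difference stands in
    for the "differential operators" named in the abstract.
    """
    return abs(1 - cmath.exp(-1j * omega * skip))


if __name__ == "__main__":
    # Compare the gain for a slow and a fast oscillation.
    for omega in (0.1, math.pi):
        for skip in (1, 4):
            gain = difference_gain(omega, skip)
            print(f"omega={omega:.2f} skip={skip} gain={gain:.3f}")
```

With `skip=1`, a slowly varying signal (small `omega`) passes with a gain close to `omega` itself, while the fastest oscillation (`omega = pi`) is doubled, which is the low-frequency attenuation the abstract describes. In this toy setting a larger `skip` raises the gain at low frequencies.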
Order-aware Convolutional Pooling for Video Based Action Recognition
TLDR
Inspired by the capacity of Convolutional Neural Networks in making use of the internal structure of images for information abstraction, this paper proposes to apply the temporal convolution operation to the frame-level representations to extract the dynamic information.
A Spatio-temporal Hybrid Network for Action Recognition
TLDR
A novel action recognition algorithm is proposed by effectively fusing 2D and Pseudo-3D CNN to learn spatio-temporal features of video to address problems of traditional 3D convolution.
Action-Stage Emphasized Spatiotemporal VLAD for Video Action Recognition
TLDR
Results show that the proposed ActionS-ST-VLAD method is able to effectively pool useful deep features spatiotemporally, leading to the state-of-the-art performance for video-based action recognition.
Multi-Level ResNets with Stacked SRUs for Action Recognition
TLDR
This work is the first to apply SRUs to distinguish actions, and it investigates the effect of diverse hyper-parameter settings, aiming to recommend better hyper-parameter choices for using SRUs.
Feature Sequence Representation Via Slow Feature Analysis For Action Classification
TLDR
The proposed method leverages the PCA-SFA projection vector to describe the sequence of even fewer frames by a fixed-dimensional video descriptor, capturing the essential temporal dynamics which is a slowly varying pattern embedded in the quickly varying input signals.
Frame-skip Convolutional Neural Networks for action recognition
  • Yinan Liu, Q. Wu, L. Tang
  • Computer Science
  • 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
  • 2017
TLDR
A novel video dynamics mining strategy that takes advantage of motion tracking in the video is proposed, and a frame-skip scheme is introduced to the ConvNets; it stacks different modalities of optical flow to build a novel motion representation.
Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
TLDR
A novel architecture called deep alternative neural network (DANN), which stacks alternative layers, is introduced; it learns contexts of local features from the very beginning and helps to preserve hierarchical context evolutions, which are shown to be essential for recognizing similar actions.
Tube ConvNets: Better exploiting motion for action recognition
TLDR
A novel framework called Tube ConvNets is introduced, which substitutes action tubes for full frames to reduce the burden on Convolutional Networks and eliminate the distraction of irrelevant objects.
Temporal Variance Analysis for Action Recognition
TLDR
Embedded in the improved dense trajectory framework, TVA for action recognition is proposed to extract appearance and motion features from gray using slow and fast filters, respectively, to separately encode the extracted local features with different temporal variances, and to concatenate all the encoded features as the final features.
Human action recognition by means of subtensor projections and dense trajectories
TLDR
Experiments on four different public datasets have shown that this technique improves IDT performance and that the results outperform those obtained by most state-of-the-art techniques for action recognition.

References

SHOWING 1-10 OF 47 REFERENCES
DL-SFA: Deeply-Learned Slow Feature Analysis for Action Recognition
TLDR
This paper uses a two-layered SFA learning structure with 3D convolution and max pooling operations to scale up the method to large inputs and capture abstract and structural features from the video.
Spatio-Temporal Laplacian Pyramid Coding for Action Recognition
TLDR
The proposed STLPC method achieves superb recognition rates on the KTH, the multiview IXMAS, the challenging UCF Sports, and the newly released HMDB51 datasets, and outperforms state-of-the-art methods, showing its great potential for action recognition.
Action Recognition with Stacked Fisher Vectors
TLDR
Experimental results demonstrate the effectiveness of SFV, and the combination of the traditional FV and SFV outperforms state-of-the-art methods on these datasets by a large margin.
Action and Event Recognition with Fisher Vectors on a Compact Feature Set
TLDR
This work finds that for basic action recognition and localization MBH features alone are enough for state-of-the-art performance, and for complex events it is found that SIFT and MFCC features provide complementary cues.
Two-Stream Convolutional Networks for Action Recognition in Videos
TLDR
This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
Better Exploiting Motion for Better Action Recognition
TLDR
It is established that adequately decomposing visual motion into dominant and residual motions, both in the extraction of the space-time trajectories and for the computation of descriptors, significantly improves action recognition algorithms.
Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice
TLDR
A comprehensive study of all steps in BoVW and different fusion methods is provided, and a simple yet effective representation is proposed, called hybrid supervector, by exploring the complementarity of different BoVW frameworks with improved dense trajectories.
Hierarchical spatio-temporal context modeling for action recognition
TLDR
This paper proposes to model the spatio-temporal context information in a hierarchical way, where three levels of context are exploited in ascending order of abstraction, and proposes to employ the Multiple Kernel Learning (MKL) technique to prune the kernels towards speedup in algorithm evaluation.
Activity representation with motion hierarchies
TLDR
This paper introduces a spectral divisive clustering algorithm to efficiently extract a hierarchy over a large number of tracklets and provides an efficient positive definite kernel that computes the structural and visual similarity of two hierarchical decompositions by relying on models of their parent–child relations.
Sampling Strategies for Real-Time Action Recognition
TLDR
A real-time action recognition system is presented that integrates a fast random sampling method with local spatio-temporal features extracted from a Local Part Model, and a new method based on the histogram intersection kernel is proposed to combine multiple channels of different descriptors.