Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction

@article{zkan2017RelaxedSD,
  title={Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction},
  author={Savas {\"O}zkan and G. Akar},
  journal={2017 IEEE International Conference on Computer Vision Workshops (ICCVW)},
  year={2017},
  pages={3094--3100}
}
  • Savas Özkan, G. Akar
  • Published 2017
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)
Frame-level visual features are generally aggregated over time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a…
1 Citation
Deep Facial Expression Recognition: A Survey
  • 293
