Multimodal Deep Convolutional Neural Network for Audio-Visual Emotion Recognition

Shiqing Zhang, Shiliang Zhang, Tiejun Huang, and Wen Gao
Emotion recognition is a challenging task because of the emotional gap between subjective emotions and low-level audio-visual features. Inspired by the recent success of deep learning in bridging the semantic gap, this paper proposes to bridge the emotional gap with a multimodal Deep Convolutional Neural Network (DCNN), which fuses audio and visual cues in a deep model. This multimodal DCNN is trained in two stages. First, two DCNN models pre-trained on large-scale image data are…
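The excerpt does not include the paper's architectural details, but the general idea of fusing features from two pre-trained streams can be sketched as follows. All dimensions, weights, and the simple linear fusion classifier below are illustrative placeholders, not the paper's actual model:

```python
import numpy as np

# Hypothetical sizes: the actual DCNN feature dimensions and the number
# of emotion classes are not given in this excerpt.
AUDIO_DIM, VISUAL_DIM, NUM_EMOTIONS = 128, 256, 6

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Stand-ins for features produced by the two pre-trained DCNN streams
# (one for the audio signal, one for the visual frames).
audio_feat = rng.standard_normal(AUDIO_DIM)
visual_feat = rng.standard_normal(VISUAL_DIM)

# Feature-level fusion: concatenate the two modalities, then apply a
# jointly trained classifier (random placeholder weights here).
fused = np.concatenate([audio_feat, visual_feat])
W = rng.standard_normal((NUM_EMOTIONS, AUDIO_DIM + VISUAL_DIM)) * 0.01
b = np.zeros(NUM_EMOTIONS)

probs = softmax(W @ fused + b)
predicted = int(np.argmax(probs))
```

In a real system, `audio_feat` and `visual_feat` would come from the two pre-trained DCNNs, and the fusion layer would be fine-tuned on emotion labels in the second training stage.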
This paper has 21 citations.

Publications citing this paper.
Showing 3 of 16 extracted citations

A Bimodal Learning Approach to Assist Multi-sensory Effects Synchronization. 2018 International Joint Conference on Neural Networks (IJCNN), 2018.

Tibetan Micro-Blog Sentiment Analysis Based on Mixed Deep Learning. 2018 International Conference on Audio, Language and Image Processing (ICALIP), 2018.

DCNN and DNN based multi-modal depression recognition. 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), 2017.


