Audio-Visual Emotion Recognition Using Neural Networks Learned with Hints

This paper presents a neural network (NN) based multimodal fusion classifier for automatic emotion recognition. Because the audio and visual channels provide complementary information, we use features from three behavioral cues: frontal-view facial expression, profile-view facial expression, and vocalization (audio). The problem of interest is to recognize…