TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition

  • James Ren Hou Lee, Alexander Wong
  • Published 3 March 2020
  • Computer Science
  • 2020 17th Conference on Computer and Robot Vision (CRV)
A core challenge faced by the majority of individuals with Autism Spectrum Disorder (ASD) is an impaired ability to infer other people’s emotions based on their facial expressions. With significant recent advances in machine learning, one potential approach to leveraging technology to assist such individuals to better recognize facial expressions and reduce the risk of possible loneliness and depression due to social isolation is the design of computer vision-driven facial expression… 

AEGIS: A real-time multimodal augmented reality computer vision based system to assist facial expression recognition for individuals with autism spectrum disorder

A multimodal augmented reality system that combines computer vision and deep convolutional neural networks to help individuals detect and interpret facial expressions in social settings; it can help individuals living with ASD learn to better identify expressions and thus improve their social experiences.

Deep learning techniques for automated detection of autism spectrum disorder based on thermal imaging

Thermal imaging was used to obtain the temperature of specific facial regions such as the eyes, cheek, forehead and nose while the authors evoked emotions in children using an audio-visual stimulus and the accuracy obtained was 96% and 90% respectively.

Interpretable Emotion Classification Using Temporal Convolutional Models

This work examines two spatiotemporal representations of moving faces, while expressing different emotions, and presents an interpretable technique for understanding the dynamics that occur during convolutional-based prediction tasks on sequences of face data.

Robust Lightweight Facial Expression Recognition Network with Label Distribution Training

This paper presents an efficiently robust facial expression recognition (FER) network, named EfficientFace, which has far fewer parameters yet is more robust for FER in the wild, and introduces a simple but efficient label distribution learning (LDL) method as a novel training strategy.



A Deep Spatial and Temporal Aggregation Framework for Video-Based Facial Expression Recognition

The main contribution of this project is the design of a novel, trainable deep neural network framework that fuses spatial information and temporal information of video according to CNNs and LSTMs for pattern recognition.

Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks

  • Behzad Hassani, M. Mahoor
  • Computer Science
    2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2017
This paper proposes a 3D Convolutional Neural Network method for FER in videos that outperforms state-of-the-art methods and emphasizes the importance of facial components rather than the facial regions that may not contribute significantly to generating facial expressions.

Real-time Convolutional Neural Networks for emotion and gender classification

It is argued that the careful implementation of modern CNN architectures, the use of the current regularization methods and the visualization of previously hidden features are necessary in order to reduce the gap between slow performances and real-time architectures.

Video-based emotion recognition using CNN-RNN and C3D hybrid networks

Extensive experiments show that combining RNN and C3D can noticeably improve video-based emotion recognition; the system was submitted to the EmotiW 2016 Challenge.

Real time facial expression recognition in video using support vector machines

A real time approach to emotion recognition through facial expression in live video is presented, employing an automatic facial feature tracker to perform face localization and feature extraction and evaluating the method in terms of recognition accuracy.

Emotion recognition in the wild from videos using images

This paper presents the implementation details of the proposed solution to the Emotion Recognition in the Wild 2016 Challenge, in the category of video-based emotion recognition, which achieves 59.42% validation accuracy and improves the competition baseline of 38.81%.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition

The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression

The Cohn-Kanade (CK+) database is presented, with baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data.

Fully Automatic Recognition of the Temporal Phases of Facial Actions

  • M. Valstar, M. Pantic
  • Computer Science
    IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
  • 2012
The proposed fully automatic method enables the detection of a much larger range of facial behavior by recognizing facial muscle actions [action units (AUs)] that compound expressions.

A Closer Look at Spatiotemporal Convolutions for Action Recognition

A new spatiotemporal convolutional block "R(2+1)D" is designed which produces CNNs that achieve results comparable or superior to the state-of-the-art on Sports-1M, Kinetics, UCF101, and HMDB51.