Connecting Subspace Learning and Extreme Learning Machine in Speech Emotion Recognition

Xinzhou Xu, Jun Deng, Eduardo Coutinho, Chen Wu, Li Zhao, Björn Schuller
IEEE Transactions on Multimedia

Speech emotion recognition (SER) is a powerful tool for endowing computers with the capacity to process information about the affective states of users in human–machine interactions. Recent research has shown the effectiveness of graph embedding-based subspace learning and extreme learning machine applied to SER, but there are still various drawbacks in these two techniques that limit their application. Regarding subspace learning, the change from linearity to nonlinearity is usually achieved… 

Transfer Sparse Discriminant Subspace Learning for Cross-Corpus Speech Emotion Recognition

A novel transfer learning method called transfer sparse discriminant subspace learning (TSDSL) is proposed. It learns a common feature subspace of different corpora by introducing discriminative learning and an $\ell_{2,1}$-norm penalty, so that the most discriminative features across corpora can be learned.
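
As a brief illustration of the $\ell_{2,1}$-norm mentioned above, the following sketch computes it for a small matrix under the common convention that it is the sum of the Euclidean norms of the rows. The function names `l21_norm` and `nonzero_rows` and the example matrix `W` are hypothetical and are not taken from the TSDSL paper; the row-wise convention is an assumption, since the paper's exact formulation is not reproduced here.

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean (l2) norms of the rows of `matrix`.

    `matrix` is a list of rows; the row-wise convention is an assumption.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)

def nonzero_rows(matrix, tol=1e-12):
    """Indices of rows whose l2 norm exceeds `tol` (the 'selected' rows)."""
    return [
        i for i, row in enumerate(matrix)
        if math.sqrt(sum(v * v for v in row)) > tol
    ]

# Hypothetical projection matrix: each row corresponds to one input feature.
W = [[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]
print(l21_norm(W))      # 18.0  (5 + 0 + 13)
print(nonzero_rows(W))  # [0, 2]
```

Because the penalty sums whole-row norms, minimizing it tends to drive entire rows to zero, which is why it is commonly used for feature selection.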

Autonomous Emotion Learning in Speech: A View of Zero-Shot Speech Emotion Recognition

The experimental results indicate that zero-shot learning is a useful technique for autonomous speech-based emotion learning, achieving accuracies considerably better than chance level and an attribute-based gold-standard setup.

A Novel Video Emotion Recognition System in the Wild Using a Random Forest Classifier

The proposed framework recognizes seven human emotions by extracting robust visual features from videos captured in the wild and handles head pose variation using a new feature extraction technique; in evaluation, it obtains better accuracy than three existing video emotion recognition methods.

A multiple feature fusion framework for video emotion recognition in the wild

The results on both the Acted Facial Expressions in the Wild (AFEW) and MMI datasets demonstrate that the proposed method outperforms several counterpart video emotion recognition methods.

Meta-heuristic approach in neural network for stress detection in Marathi speech

This paper proposes SER using a neural network classifier with weight optimization using a fusion of optimization algorithms, viz.

Negative correlation learning in the extreme learning machine framework

This work proposes an analytical solution for the parameters of the ELM base learners, which significantly reduces the computational burden of the standard NCL ensemble method; the resulting method statistically outperforms the comparison ensemble methods in accuracy.

Global convergence of Negative Correlation Extreme Learning Machine

Sufficient conditions that guarantee the global convergence of NCELM are presented: the update of the ensemble in each iteration is defined as a contraction mapping, and global convergence of the ensemble is proved through the Banach fixed-point theorem.
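
The convergence argument summarized above rests on a general fact: repeatedly applying a contraction mapping converges to its unique fixed point from any starting value. The sketch below illustrates only that general principle on a scalar toy map. It is not the NCELM update; the function `fixed_point` and the example map `f` are hypothetical.

```python
def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Iterate x <- f(x) until successive values differ by less than `tol`.

    Returns the final value and the number of iterations used.
    Convergence is only guaranteed when `f` is a contraction.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_new = f(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Toy contraction with Lipschitz constant 0.5; its unique fixed point is x = 2.
f = lambda x: 0.5 * x + 1.0

for start in (10.0, -100.0):
    value, steps = fixed_point(f, start)
    print(start, value, steps)
```

Both starting points reach the same limit, which is the scalar analogue of "global" convergence: the result does not depend on the initialization.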

Happy Emotion Recognition in Videos Via Apex Spotting and Temporal Models

  • N. Samadiani, Guangyan Huang
  • Computer Science
    2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
  • 2020
This paper proposes a new single-emotion recognition method that recognizes happy emotion from key frames of facial expression videos and achieves higher accuracy than four counterpart methods.

Happy Emotion Recognition From Unconstrained Videos Using 3D Hybrid Deep Features

A happy emotion recognition model using 3D hybrid deep and distance features (HappyER-DDF) is proposed, which improves accuracy by extracting and using two different types of deep visual features.

Affective computing in the context of music therapy: a systematic review

A systematic review of the literature in the field of affective computing in the context of music therapy to assess AI methods to perform automatic emotion recognition applied to Human-Machine Musical Interfaces (HMMI).



Speech emotion recognition using deep neural network and extreme learning machine

The experimental results demonstrate that the proposed approach effectively learns emotional information from low-level features and leads to a 20% relative accuracy improvement compared to state-of-the-art approaches.

Speech emotion recognition using transfer non-negative matrix factorization

In this paper, a novel transfer non-negative matrix factorization (TNMF) method is presented for cross-corpus speech emotion recognition, and the results verify that the TNMF method can significantly outperform the automatic and competitive methods.

Speech Emotion Analysis: Exploring the Role of Context

A novel set of features based on cepstrum analysis of pitch and intensity contours is introduced and the effects of different contexts on two different databases are systematically analyzed.

Compensating for speaker or lexical variabilities in speech for emotion recognition

Dimensionality reduction for speech emotion features by multiscale kernels

To achieve efficient and compact low-dimensional features for speech emotion recognition, a novel feature reduction method using multiscale kernels in the framework of graph embedding is proposed.

Spectral Regression for Efficient Regularized Subspace Learning

This paper proposes a novel dimensionality reduction framework, called spectral regression (SR), for efficient regularized subspace learning. SR casts the problem of learning the projective functions into a regression framework, which avoids eigen-decomposition of dense matrices.

Graph Embedded Extreme Learning Machine

In this paper, we propose a novel extension of the extreme learning machine (ELM) algorithm for single-hidden layer feedforward neural network training that is able to incorporate subspace learning

Acoustic emotion recognition: A benchmark comparison of performances

The largest-to-date benchmark comparison under equal conditions on nine standard corpora in the field, using the two predominant paradigms, is provided; large differences are found among corpora, mostly stemming from naturalistic emotions and spontaneous speech vs. more prototypical events.

Local Receptive Fields Based Extreme Learning Machine

The general architecture of locally connected ELM is studied, showing that: 1) ELM theories are naturally valid for local connections, thus introducing local receptive fields to the input layer; 2) each hidden node in ELM can be a combination of several hidden nodes (a subnetwork), which is also consistent with ELM theory.