IEMOCAP: interactive emotional dyadic motion capture database
A new corpus, the "interactive emotional dyadic motion capture database" (IEMOCAP), collected by the Speech Analysis and Interpretation Laboratory at the University of Southern California (USC), provides detailed information about actors' facial expressions and hand movements during scripted and spontaneous spoken communication scenarios.
The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing
A basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis, is proposed; it is intended to provide a common baseline for evaluating future research and to eliminate differences caused by varying parameter sets or even different implementations of the same parameters.
Analysis of emotion recognition using facial expressions, speech and multimodal information
Results reveal that the system based on facial expressions performed better than the system based on acoustic information alone for the emotions considered, and that when these two modalities are fused, the performance and robustness of the emotion recognition system improve measurably.
Emotion recognition using a hierarchical binary decision tree approach
A hierarchical computational structure for emotion recognition is introduced that maps an input speech utterance into one of multiple emotion classes through successive layers of binary classifications, and is shown to be effective for classifying emotional utterances across multiple database contexts.
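The idea of mapping an utterance through layers of binary decisions can be sketched as follows. This is a minimal illustration, not the paper's actual system: the feature names, thresholds, and emotion groupings here are hypothetical stand-ins for trained binary classifiers.

```python
class BinaryNode:
    """One layer of the hierarchy: a binary split over groups of emotions."""
    def __init__(self, classifier, left, right):
        self.classifier = classifier  # callable returning False (left) or True (right)
        self.left = left              # subtree (BinaryNode) or final emotion label
        self.right = right

    def predict(self, features):
        branch = self.right if self.classifier(features) else self.left
        if isinstance(branch, BinaryNode):
            return branch.predict(features)
        return branch  # leaf reached: final emotion class

# Toy stand-in classifiers keyed on an "arousal" feature (features[0]) and a
# "valence" feature (features[1]); a real system would train each split.
high_arousal = lambda f: f[0] > 0.5
positive_valence = lambda f: f[1] > 0.0

tree = BinaryNode(
    high_arousal,
    left=BinaryNode(positive_valence, left="sad", right="neutral"),
    right=BinaryNode(positive_valence, left="angry", right="happy"),
)

print(tree.predict([0.9, 0.8]))   # high arousal, positive valence -> happy
print(tree.predict([0.1, -0.3]))  # low arousal, negative valence -> sad
```

Each internal node only has to separate two coarse groups, which is typically an easier problem than a flat multi-class decision over all emotions at once.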
Emotion recognition based on phoneme classes
It was found that (spectral properties of) vowel sounds were the best indicator of emotions in terms of classification performance, and that the best performance is obtained by using phoneme-class classifiers rather than a generic "emotional" HMM classifier or classifiers based on global prosodic features.
MSP-IMPROV: An Acted Corpus of Dyadic Interactions to Study Emotion Perception
The MSP-IMPROV corpus is presented: a multimodal emotional database whose goal is to maintain control over lexical content and emotion while also promoting naturalness in the recordings, leveraging the large size of the audiovisual database.
Analysis of Emotionally Salient Aspects of Fundamental Frequency for Emotion Detection
An analysis of statistics derived from the pitch contour indicates that gross pitch contour statistics such as mean, maximum, minimum, and range are more emotionally prominent than features describing the pitch shape.
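The gross contour statistics named above (mean, maximum, minimum, range) are straightforward to compute from a frame-level F0 track. A minimal sketch, assuming the common convention that unvoiced frames are marked with 0 Hz and should be excluded:

```python
import numpy as np

def gross_pitch_stats(f0):
    """Gross statistics of a pitch (F0) contour in Hz, ignoring
    unvoiced frames (conventionally encoded as 0)."""
    f0 = np.asarray(f0, dtype=float)
    voiced = f0[f0 > 0]
    return {
        "mean": voiced.mean(),
        "max": voiced.max(),
        "min": voiced.min(),
        "range": voiced.max() - voiced.min(),
    }

# Hypothetical 8-frame contour in Hz; 0 marks unvoiced frames.
contour = [0, 0, 180.0, 210.0, 250.0, 230.0, 0, 190.0]
print(gross_pitch_stats(contour))
# {'mean': 212.0, 'max': 250.0, 'min': 180.0, 'range': 70.0}
```

These utterance-level aggregates discard the temporal shape of the contour entirely, which is exactly the contrast the paper draws against shape-describing features.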
Domain Adversarial for Acoustic Emotion Recognition
  • M. Abdelwahab, C. Busso
  • Computer Science, Engineering
  • IEEE/ACM Transactions on Audio, Speech, and…
  • 20 April 2018
It is shown that exploiting unlabeled data consistently leads to better emotion recognition performance across all emotional dimensions, and the effect of adversarial training on the feature representation across the proposed deep learning architecture is visualized.
Interrelation Between Speech and Facial Gestures in Emotional Utterances: A Single Subject Study
The results suggest that emotional content affects the relationship between facial gestures and speech; principal component analysis (PCA) shows that the audiovisual mapping parameters are grouped in a smaller subspace, suggesting an emotion-dependent structure that is preserved across sentences.
Correcting Time-Continuous Emotional Labels by Modeling the Reaction Lag of Evaluators
A time-shift that maximizes the mutual information between the expressive behaviors and the time-continuous annotations is proposed, implemented by making different assumptions about the evaluators' reaction lag.
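The core idea of estimating the reaction lag as the annotation shift that maximizes mutual information can be sketched as below. This is an illustrative reconstruction, not the paper's implementation: the histogram-based MI estimator, bin count, and lag search range are all assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Simple histogram-based mutual information estimate between
    two equal-length continuous signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def best_lag(behavior, annotation, max_lag=50):
    """Return the shift (in frames) of the annotation stream that maximizes
    MI with the behavioral signal, i.e. the estimated reaction lag."""
    scores = []
    for lag in range(max_lag + 1):
        b = behavior[: len(behavior) - lag] if lag else behavior
        a = annotation[lag:]
        scores.append(mutual_information(b, a))
    return int(np.argmax(scores))

# Synthetic check: the annotation is the behavior delayed by 12 frames plus noise.
rng = np.random.default_rng(0)
behavior = rng.standard_normal(2000)
annotation = np.roll(behavior, 12) + 0.1 * rng.standard_normal(2000)
print(best_lag(behavior, annotation))  # should recover a lag near 12
```

Compensating for this lag before training realigns the time-continuous labels with the behaviors that actually caused them.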