Publications
IEMOCAP: interactive emotional dyadic motion capture database
A new corpus, the "interactive emotional dyadic motion capture database" (IEMOCAP), collected by the Speech Analysis and Interpretation Laboratory at the University of Southern California (USC), provides detailed information about the recorded speakers' facial expressions and hand movements during scripted and spontaneous spoken communication scenarios.
Analysis of emotion recognition using facial expressions, speech and multimodal information
Results reveal that the system based on facial expressions gave better performance than the system based on just acoustic information for the emotions considered, and that when these two modalities are fused, the performance and robustness of the emotion recognition system improve measurably.
Emotion recognition based on phoneme classes
It was found that (spectral properties of) vowel sounds were the best indicator of emotions in terms of classification performance, and that better performance can be obtained with phoneme-class classifiers than with a generic "emotional" HMM classifier or classifiers based on global prosodic features.
Expressive Facial Animation Synthesis by Learning Speech Coarticulation and Expression Spaces
This paper presents a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data.
An acoustic study of emotions expressed in speech
Results show that happiness/anger and neutral/sadness share similar acoustic properties for this speaker, suggesting that the conventional acoustic parameters examined in this study are not effective in describing emotions along the valence (or pleasure) dimension.
Expressive speech synthesis using a concatenative synthesizer
The highest recognition accuracies were achieved for sentences synthesized using prosody and a diphone inventory belonging to the same emotion; anger was classified as inventory-dominant, while sadness and neutral were prosody-dominant.
Enhancing physical activity through context-aware coaching
The results of this field study suggest that context-aware coaching is effective, even though the difference between the control and context-aware participants was not statistically significant.
Limited domain synthesis of expressive military speech for animated characters
Preliminary results are presented from an effort aimed at synthesizing expressive military speech for training applications, using samples of expressive speech classified according to speaking style.
Constructing emotional speech synthesizers with limited speech database
An emotional speech synthesis technique based on HMMs is proposed for the case where only a limited amount of training data is available, directly incorporating the results of subjective evaluations performed on the training data.
Camera-based heart rate monitoring in highly dynamic light conditions
An infrared-based alternative for light-robust, camera-based heart rate measurement suitable for automotive applications is presented; it enables new applications in the automotive field, especially since heart rate measurement can be integrated with other camera-based driver monitoring solutions such as eye tracking.