Human emotional and cognitive states evolve with variable intensity and clarity over the course of social interactions and experiences, and they are continuously influenced by multimodal input from the environment and from the interaction participants. This has motivated the development of a new area within affective computing that …
Emotion expression is an essential part of human interaction. Rich emotional information is conveyed through the human face. In this study, we analyze detailed motion-captured facial information of ten speakers of both genders during emotional speech. We derive compact facial representations using methods motivated by Principal Component Analysis and …
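As a rough illustration of the dimensionality reduction this abstract describes, here is a minimal sketch of deriving a compact facial representation with PCA. It is not the study's actual pipeline: the marker count, array shapes, and random stand-in data are assumptions.

```python
# Minimal sketch: a compact facial representation via PCA.
# `markers` is a hypothetical array of motion-captured facial marker
# coordinates, shape (n_frames, n_markers * 3), x/y/z stacked per marker.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
markers = rng.normal(size=(500, 46 * 3))  # stand-in for real mocap data

pca = PCA(n_components=10)          # keep a low-dimensional facial basis
codes = pca.fit_transform(markers)  # compact per-frame representation
print(codes.shape)                            # (500, 10)
print(pca.explained_variance_ratio_.sum())    # fraction of variance retained
```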
Improvised acting is a viable technique for studying human communication and for shedding light on actors' creativity. The USC CreativeIT database provides a novel bridge between the study of theatrical improvisation and human expressive behavior in dyadic interaction. The theoretical design of the database is based on the well-established improvisation technique …
Speaker state recognition is a challenging problem due to speaker and context variability. Intoxication detection is an important area of paralinguistic speech research with potential real-world applications. In this work, we build upon a base set of various static acoustic features by proposing the combination of several different methods for this learning …
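The abstract is truncated before it names the combination methods, but a common baseline for this kind of paralinguistic task is to concatenate static acoustic feature sets before classification. The sketch below assumes that setup; the feature dimensions, random stand-in data, and classifier choice are all illustrative.

```python
# Minimal sketch: combining static acoustic feature views at the feature
# level before classification. Matrices are random stand-ins for real
# per-utterance acoustic functionals.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
base_feats = rng.normal(size=(200, 384))   # base set of static features
extra_feats = rng.normal(size=(200, 64))   # an additional feature view
labels = rng.integers(0, 2, size=200)      # e.g., intoxicated vs. sober

X = np.hstack([base_feats, extra_feats])   # simple feature-level combination
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X, labels)
print(clf.score(X, labels))
```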
Emotion expression associated with human communication is known to be a multimodal process. In this work, we investigate the way that emotional information is conveyed by facial and vocal modalities, and how these modalities can be effectively combined to achieve improved emotion recognition accuracy. In particular, the behaviors of different facial regions …
Human expressive interactions are characterized by an ongoing unfolding of verbal and nonverbal cues. Such cues convey the interlocutor's emotional state, which is continuous and of variable intensity and clarity over time. In this paper, we examine the emotional content of body language cues describing a participant's posture, relative position and …
In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. Current approaches based on generative or discriminative models have different but important shortcomings that limit their accuracy. In this paper we discuss these limitations and …
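To make the core idea concrete, here is a minimal sketch of a generative belief update over hidden dialog states: the posterior follows b'(s') ∝ P(o | s') · Σ_s P(s' | s) b(s). The states, transition model, and observation likelihoods below are hypothetical, not from the paper.

```python
# Minimal sketch of generative dialog state tracking: predict with the
# transition model, then correct with the (possibly misrecognized) ASR
# observation likelihood, and renormalize.
import numpy as np

states = ["want_pizza", "want_sushi"]
belief = np.array([0.5, 0.5])                 # prior over hidden states

transition = np.array([[0.9, 0.1],            # P(s' | s): the user rarely
                       [0.1, 0.9]])           # changes their goal

# Likelihood of the noisy ASR observation under each state,
# e.g., the recognizer heard something like "pizza".
obs_likelihood = np.array([0.8, 0.3])

predicted = transition.T @ belief             # predict step
posterior = obs_likelihood * predicted        # correct step
posterior /= posterior.sum()                  # renormalize
print(dict(zip(states, posterior.round(3))))
```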
Human emotional expression tends to evolve in a structured manner, in the sense that certain emotional evolution patterns (e.g., anger to anger) are more probable than others (e.g., anger to happiness). Furthermore, the perception of an emotional display can be affected by recent emotional displays. Therefore, the emotional content of past and future …
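One simple way to capture this structured evolution is a first-order Markov chain over emotion categories in which self-transitions dominate. The sketch below assumes that formulation; the emotion set and transition probabilities are purely illustrative.

```python
# Minimal sketch: a Markov transition matrix where self-transitions
# (anger -> anger) are more probable than cross-transitions
# (anger -> happiness). Each row sums to 1.
import numpy as np

emotions = ["anger", "happiness", "sadness", "neutral"]
T = np.array([
    [0.70, 0.05, 0.10, 0.15],   # from anger
    [0.05, 0.70, 0.05, 0.20],   # from happiness
    [0.10, 0.05, 0.70, 0.15],   # from sadness
    [0.15, 0.20, 0.15, 0.50],   # from neutral
])

def sequence_log_prob(seq):
    """Log-probability of an emotion label sequence under the chain."""
    idx = [emotions.index(e) for e in seq]
    return sum(np.log(T[a, b]) for a, b in zip(idx, idx[1:]))

print(sequence_log_prob(["anger", "anger", "neutral"]))    # likely pattern
print(sequence_log_prob(["anger", "happiness", "anger"]))  # unlikely pattern
```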
In this paper, we apply a context-sensitive technique for multimodal emotion recognition based on feature-level fusion of acoustic and visual cues. We use bidirectional Long Short-Term Memory (BLSTM) networks which, unlike most other emotion recognition approaches, exploit long-range contextual information for modeling the evolution of emotion within a …
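The following is a minimal PyTorch sketch of the BLSTM idea: a bidirectional recurrent layer over fused audio-visual feature sequences, emitting per-frame emotion scores. The feature dimension, hidden size, class count, and class name are assumptions, not the paper's configuration.

```python
# Minimal sketch of a bidirectional LSTM (BLSTM) emotion tagger over fused
# acoustic-visual feature vectors; all dimensions are illustrative.
import torch
import torch.nn as nn

class BLSTMEmotionTagger(nn.Module):
    def __init__(self, feat_dim=120, hidden=64, n_emotions=4):
        super().__init__()
        self.blstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                             bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_emotions)  # 2x for both directions

    def forward(self, x):
        # x: (batch, time, feat_dim) fused audio-visual features
        h, _ = self.blstm(x)        # (batch, time, 2*hidden)
        return self.out(h)          # per-frame emotion logits

model = BLSTMEmotionTagger()
fused = torch.randn(2, 50, 120)    # 2 utterances, 50 frames each
print(model(fused).shape)          # torch.Size([2, 50, 4])
```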
Emotion is expressed and perceived through multiple modalities. In this work, we model face, voice and head movement cues for emotion recognition and we fuse classifiers using a Bayesian framework. The facial classifier is the best performing, followed by the voice and head classifiers, and the multiple modalities seem to carry complementary information, …
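A minimal sketch of one common form of Bayesian classifier fusion: if face, voice, and head cues are assumed conditionally independent given the emotion, the fused posterior is proportional to the product of per-modality posteriors divided by the prior squared. The class count and all probabilities below are illustrative, not the paper's results.

```python
# Minimal sketch of Bayesian fusion of per-modality emotion classifiers
# under a conditional-independence assumption:
#   P(e | f, v, h)  ∝  P(e | f) * P(e | v) * P(e | h) / P(e)^2
import numpy as np

prior = np.array([0.25, 0.25, 0.25, 0.25])    # over 4 emotion classes
p_face = np.array([0.60, 0.15, 0.15, 0.10])   # best single modality
p_voice = np.array([0.40, 0.30, 0.20, 0.10])
p_head = np.array([0.35, 0.25, 0.25, 0.15])

fused = p_face * p_voice * p_head / prior**2  # Bayes under independence
fused /= fused.sum()
print(fused.round(3))                         # fused emotion posterior
```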