Corpus ID: 33194189

Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms

Erik Marchi, Florian Eyben, Gerhard Hagerer, Björn Schuller
We demonstrate audEERING's sensAI technology running natively on low-resource mobile devices, applied to emotion analytics and speaker characterisation tasks. A show-case application for the Android platform is provided, in which audEERING's highly noise-robust voice activity detection based on LSTM-RNN is combined with our core emotion recognition and speaker characterisation engine natively on the mobile device. This eliminates the need for network connectivity and makes it possible to perform robust…
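The abstract describes an on-device pipeline in which a voice activity detector gates the audio before it reaches the emotion engine. The paper's VAD is an LSTM-RNN; as a minimal sketch of the interface such a component exposes, the snippet below uses a plain log-energy threshold instead (the frame sizes and threshold are assumptions, not values from the paper):

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n)])

def energy_vad(x, threshold_db=-35.0):
    """Return one boolean voice-activity flag per frame.

    A simple log-energy threshold stands in for the LSTM-RNN VAD
    described in the paper; the -35 dB threshold is an assumption.
    """
    frames = frame_signal(x)
    energy = np.maximum(np.mean(frames ** 2, axis=1), 1e-12)
    log_e = 10.0 * np.log10(energy)
    return log_e > threshold_db

# toy usage: 100 ms of silence followed by 100 ms of a 440 Hz tone
sig = np.concatenate([np.zeros(1600),
                      0.5 * np.sin(2 * np.pi * 440 * np.arange(1600) / 16000)])
flags = energy_vad(sig)
```

Downstream, only frames with `flags == True` would be passed to the emotion recognition engine, which is what removes the need to stream silence over a network.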


In Search of State and Trait Emotion Markers in Mobile-Sensed Language: Field Study

Background: Emotions and mood are important for overall well-being. Therefore, the search for continuous, effortless emotion prediction methods is an important field of study. Mobile sensing provides…

Depression Detection from Short Utterances via Diverse Smartphones in Natural Environmental Conditions

This paper conducts the first investigation of voice-based depression assessment techniques on real-world data from 887 speakers, recorded using a variety of different smartphones, and suggests that normalization based on these criteria may be more effective than tailored models for detecting depressed speech.

Detecting emotions from human speech: role of gender information

Speech-based emotion recognition systems are implemented to study the role of gender information and data augmentation on system performance; the results indicate the efficacy of the LSTM-based approach, particularly for female speakers on the augmented dataset.

"I have vxxx bxx connexxxn!": Facing Packet Loss in Deep Speech Emotion Recognition

In applications that use emotion recognition via speech, frame loss can be a severe issue: the audio stream may lose data frames for a variety of reasons, such as low…
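To study robustness of this kind, one common experimental setup is to drop frames from the feature stream at a chosen rate before classification. The sketch below is such a simulation under the simplifying assumption of independent (non-bursty) losses, with dropped frames zero-filled; it is an illustration of the experimental condition, not the paper's exact protocol:

```python
import numpy as np

def simulate_packet_loss(features, loss_rate=0.2, rng=None):
    """Zero out a random subset of feature frames to mimic packet loss.

    `features` is (num_frames, num_features); dropped frames are replaced
    with zeros. Burst-loss patterns (common on real networks) are not
    modelled here -- losses are i.i.d. per frame.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    kept = rng.random(len(features)) >= loss_rate  # True = frame survived
    out = features.copy()
    out[~kept] = 0.0
    return out, kept

feats = np.ones((100, 40))  # dummy log-mel-style feature stream
noisy, kept = simulate_packet_loss(feats, loss_rate=0.3)
```

Training the recognizer on such corrupted streams (or interpolating over the gaps) are the usual countermeasures evaluated in this line of work.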

Few-shot Learning in Emotion Recognition of Spontaneous Speech Using a Siamese Neural Network with Adaptive Sample Pair Formation

This work proposes a few-shot learning approach for automatically recognizing emotion in spontaneous speech from a small number of labelled samples through a siamese neural network, which models the relative distance between samples rather than relying on learning absolute patterns of the corresponding distributions of each emotion.
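The key idea — classifying by relative distance to a few labelled samples rather than by absolute class patterns — can be sketched independently of the network itself. Below, plain Euclidean distance on fixed embeddings stands in for the siamese network's learned metric; the 2-D "embeddings" and class names are invented for illustration:

```python
import numpy as np

def few_shot_classify(query, support, support_labels):
    """Assign the query the label of its nearest support embedding.

    In the paper the distance is learned by a siamese network; here
    plain Euclidean distance on fixed embeddings stands in for it.
    """
    d = np.linalg.norm(support - query, axis=1)
    return support_labels[int(np.argmin(d))]

# toy 2-D "embeddings" for two emotion classes, two samples each
support = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
labels = np.array(["neutral", "neutral", "angry", "angry"])
pred = few_shot_classify(np.array([4.8, 5.1]), support, labels)
```

Because only pairwise distances are needed, a new emotion class can be added with a handful of labelled samples and no retraining of the classifier head — the property that makes the few-shot setting attractive.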

Learning Transferable Features for Speech Emotion Recognition

A deep architecture is proposed that jointly exploits a convolutional network for extracting domain-shared features and a long short-term memory network for classifying emotions using domain-specific features.

Latest Advances in Computational Speech Analysis for Mobile Sensing

Within this chapter, a selection of state-of-the-art speech analysis toolkits, which enable this research, are introduced and their advantages and limitations concerning mobile sensing are discussed.

Automatic analysis of voice emotions in think aloud usability evaluation: a case of online shopping

It is proposed that assessment of emotions in think aloud verbalisations is a moment-based approach for measuring emotional experiences, and an approach for the automatic assessment of emotional experiences evoked during the think aloud protocol is proposed and validated.

A deep learning approach to prepare participants for negotiations by recognizing emotions with voice analysis

This work presents an approach to emotion recognition (sentiment detection) based on characteristics of the human voice, using methods from the field of deep learning, specifically convolutional neural networks, and shows that it is possible to determine emotions with a certain accuracy in real time.

Emotional expression in psychiatric conditions: New technology for clinicians

The aim of this paper is to review the currently used tools using modern technology and discuss their usefulness as well as possible future directions in emotional expression research and treatment strategies.

A Study of Speech Emotion Recognition and Its Application to Mobile Services

The experimental results indicate that the proposed method provides very stable and successful emotional classification performance of 72.5% over five emotional states, showing the feasibility of the agent for mobile communication services.

How's my mood and stress?: an efficient speech analysis library for unobtrusive monitoring on mobile phones

The AMMON (Affective and Mental health MONitor) library is described, a low footprint C library designed for widely available phones as an enabler of these applications.

EmotionSense: a mobile phones based adaptive platform for experimental social psychology research

It is shown how speakers and participants' emotions can be automatically detected by means of classifiers running locally on off-the-shelf mobile phones, and how speaking and interactions can be correlated with activity and location measures.

Recent developments in openSMILE, the munich open-source multimedia feature extractor

We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features…
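openSMILE's core recipe is to compute frame-wise low-level descriptors (LLDs) and then summarise each with statistical functionals over the utterance. The toy version below computes just two LLDs (RMS energy and zero-crossing rate) and two functionals (mean and standard deviation) in plain numpy — openSMILE's real feature sets contain thousands of such values, and this sketch does not use the openSMILE API:

```python
import numpy as np

def lld_and_functionals(x, frame_len=400, hop=160):
    """Tiny stand-in for the LLD + functionals pipeline.

    Per frame: RMS energy and zero-crossing rate (two LLDs).
    Per utterance: mean and std of each LLD (four functionals).
    """
    n = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.stack([x[i * hop: i * hop + frame_len] for i in range(n)])
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    llds = np.stack([rms, zcr], axis=1)                           # (frames, 2)
    return np.concatenate([llds.mean(axis=0), llds.std(axis=0)])  # 4 values

# one second of a 5 Hz test tone at 16 kHz
vec = lld_and_functionals(np.sin(2 * np.pi * 5 * np.linspace(0, 1, 16000)))
```

The fixed-length functional vector is what gets fed to a classifier, which is why the same pipeline works for utterances of any duration.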

Recent developments and results of ASC-Inclusion: An Integrated Internet-Based Environment for Social Inclusion of Children with Autism Spectrum Conditions

The ASC-Inclusion project helps children with ASC by allowing them to learn how emotions can be expressed and recognised via playing games in a virtual world through facial expressions, tone of voice and body gestures.

What Should a Generic Emotion Markup Language Be Able to Represent?

A rich collection of use cases was compiled, and a structured set of requirements was distilled, which comprises the representation of the emotion-related state itself, some meta-information about that representation, various kinds of links to the "rest of the world", and several kinds of global metadata.

Distributing Recognition in Computational Paralinguistics

This paper conducts large-scale evaluations of some key functions, namely, feature compression/decompression, model training and classification, on five common paralinguistic tasks related to emotion, intoxication, pathology, age and gender and shows that, for most tasks, the recognition accuracies are very close to the baselines.
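Feature compression of the kind evaluated here trades a small loss of precision for large bandwidth savings between device and server. As a hedged illustration (not the paper's scheme), the sketch below uniformly quantizes a float feature matrix to 8-bit codes and decodes it again; the reconstruction error is bounded by half a quantization step:

```python
import numpy as np

def quantize(features, bits=8):
    """Uniform quantization of a float feature matrix to `bits`-bit codes.

    Returns the integer codes plus the (lo, hi) range needed to decode.
    Assumes hi > lo, i.e. the features are not all identical.
    """
    lo, hi = features.min(), features.max()
    levels = 2 ** bits - 1
    codes = np.round((features - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, (lo, hi)

def dequantize(codes, lo_hi, bits=8):
    """Map integer codes back to floats in the original range."""
    lo, hi = lo_hi
    return codes.astype(np.float64) / (2 ** bits - 1) * (hi - lo) + lo

feats = np.random.default_rng(0).normal(size=(50, 20))
codes, span = quantize(feats)
recon = dequantize(codes, span)
```

Against 64-bit floats this is an 8x size reduction per value, which is the kind of trade-off such distributed-recognition evaluations quantify against the accuracy baselines.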

The INTERSPEECH 2015 computational paralinguistics challenge: nativeness, parkinson's & eating condition

Three sub-challenges are described: the estimation of the degree of nativeness, the neurological state of patients with Parkinson's condition, and the eating condition of speakers, i.e.…

The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism

The INTERSPEECH 2013 Computational Paralinguistics Challenge provides for the first time a unified test-bed for social signals such as laughter in speech. It further introduces conflict in group…

The INTERSPEECH 2014 computational paralinguistics challenge: cognitive & physical load

These two sub-challenges, their conditions, baseline results, and experimental procedures are described, as well as the ComParE baseline features generated with the openSMILE toolkit and provided to the participants in the challenge.