Publications
openSMILE: the Munich versatile and fast open-source audio feature extractor
TLDR
The openSMILE feature extraction toolkit is introduced, which unites feature extraction algorithms from the speech processing and Music Information Retrieval communities and has a modular, component-based architecture that makes extensions via plug-ins easy.
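As a concrete illustration of using the toolkit from Python, the hedged sketch below extracts frame-level low-level descriptors with the `opensmile` Python wrapper around openSMILE; the chosen feature set and the file name `speech.wav` are assumptions for the example, not part of the paper.

```python
# Minimal sketch, assuming the `opensmile` Python wrapper is installed
# (pip install opensmile); "speech.wav" is a hypothetical input file.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,              # large brute-force acoustic set
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,   # frame-wise features
)
llds = smile.process_file("speech.wav")   # pandas DataFrame: one row per frame
print(llds.shape)
```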
The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing
TLDR
A basic standard acoustic parameter set is proposed for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis; it is intended to provide a common baseline for evaluating future research and to eliminate differences caused by varying parameter sets or even different implementations of the same parameters.
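For illustration, a hedged sketch of computing GeMAPS-style functionals with the same `opensmile` Python wrapper follows; the extended eGeMAPSv02 set and the input file name are assumptions made for the example.

```python
# Hedged sketch: extracting extended GeMAPS (eGeMAPS) functionals via the
# `opensmile` Python wrapper; the input file name is a placeholder.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,       # 88 acoustic parameters
    feature_level=opensmile.FeatureLevel.Functionals,  # one summary vector per file
)
gemaps = smile.process_file("speech.wav")
print(gemaps.columns[:5].tolist())  # e.g. F0 and loudness functionals
```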
Recent developments in openSMILE, the Munich open-source multimedia feature extractor
We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features for multi-modal processing.
AVEC 2014: 3D Dimensional Affect and Depression Recognition Challenge
TLDR
The fourth Audio-Visual Emotion Recognition Challenge (AVEC 2014) is presented; it uses a subset of the tasks from a previous challenge, allowing for more focussed studies, and the performance of the baseline system on the two tasks is reported.
The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism
The INTERSPEECH 2013 Computational Paralinguistics Challenge provides for the first time a unified test-bed for Social Signals such as laughter in speech. It further introduces conflict in group discussions as a new task.
AVEC 2011-The First International Audio/Visual Emotion Challenge
The Audio/Visual Emotion Challenge and Workshop (AVEC 2011) is the first competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audio, visual, and audiovisual emotion analysis.
AVEC 2013: the continuous audio/visual emotion and depression recognition challenge
TLDR
The third Audio-Visual Emotion Recognition Challenge (AVEC 2013) has two goals, logically organised as sub-challenges: the first is to predict the continuous values of the affective dimensions valence and arousal at each moment in time, and the second is to predict the value of a single depression indicator for each recording in the dataset.
AVEC 2012: the continuous audio/visual emotion challenge
TLDR
The challenge guidelines, the common data used, and the performance of the baseline system on the two tasks are presented.
The INTERSPEECH 2012 Speaker Trait Challenge
TLDR
The INTERSPEECH 2012 Speaker Trait Challenge provides a unified test-bed for assessing speaker traits from speech, with sub-challenges on Personality, Likability, and Pathology.
Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies
TLDR
Results employing six standard databases in a cross-corpus evaluation experiment show the crucial performance inferiority of inter- to intra-corpus testing; different types of normalization are also investigated.
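Because the entry above compares normalization strategies for cross-corpus testing, here is a small, self-contained sketch of one common strategy, per-corpus z-normalization of feature matrices; the corpus names and the random stand-in data are purely illustrative and not taken from the paper.

```python
# Sketch of per-corpus feature standardisation (corpus z-normalisation),
# one strategy for reducing corpus-specific feature shifts before
# cross-corpus training/testing. Data below are hypothetical placeholders.
import numpy as np

def zscore_per_corpus(features_by_corpus):
    """Standardise each corpus with its own per-feature mean and std."""
    normalised = {}
    for corpus, X in features_by_corpus.items():
        mu = X.mean(axis=0)
        sigma = X.std(axis=0) + 1e-12   # avoid division by zero
        normalised[corpus] = (X - mu) / sigma
    return normalised

# usage with random stand-in data for two corpora
rng = np.random.default_rng(0)
data = {
    "corpus_a": rng.normal(size=(100, 20)),
    "corpus_b": rng.normal(size=(80, 20)),
}
data_norm = zscore_per_corpus(data)
print(data_norm["corpus_a"].mean(axis=0).round(3))  # ~0 per feature
```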
...