OpenEAR — Introducing the munich open-source emotion and affect recognition toolkit

@inproceedings{Eyben2009OpenEAR,
  title={OpenEAR — Introducing the munich open-source emotion and affect recognition toolkit},
  author={Florian Eyben and Martin W{\"o}llmer and Bj{\"o}rn Schuller},
  booktitle={2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops},
  year={2009}
}
  • F. Eyben, M. Wöllmer, B. Schuller
  • Published 8 December 2009
  • Computer Science
  • 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops
Various open-source toolkits exist for speech recognition and speech processing. […] The components include audio recording and audio file reading, state-of-the-art paralinguistic feature extraction, and pluggable classification modules. In this paper we introduce the engine and extensive baseline results. Pre-trained models for four affect recognition tasks are included in the openEAR distribution. The engine is tailored for multi-threaded, incremental on-line processing of live input in real time…
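The two-stage paradigm the abstract describes (frame-wise low-level descriptors projected onto statistical functionals, then fed to a classifier) can be illustrated in a few lines. This is a pure-Python sketch of the general idea only; all function names, frame sizes, and descriptor choices are illustrative and are not openEAR's actual C++ API.

```python
import math

def frame_signal(signal, frame_len=400, hop=160):
    """Split a sample list into overlapping frames (sizes are illustrative)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def low_level_descriptors(frame):
    """Two classic LLDs per frame: log-energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / len(frame)
    return [math.log(energy + 1e-10), zcr]

def functionals(values):
    """Project a variable-length LLD contour onto fixed statistical functionals."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [mean, std, min(values), max(values)]

def extract_features(signal):
    """Frame the signal, compute per-frame LLDs, summarize each LLD contour."""
    frames = frame_signal(signal)
    contours = zip(*(low_level_descriptors(f) for f in frames))
    feats = []
    for contour in contours:  # one contour per LLD
        feats.extend(functionals(list(contour)))
    return feats

# Example: a 1-second 440 Hz tone at 16 kHz yields one fixed-length vector.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
vec = extract_features(tone)  # 2 LLDs x 4 functionals = 8 features
```

The point of the functional projection is that utterances of any duration map to a vector of constant dimensionality, which is what makes static classifiers such as SVMs applicable.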

OpenMM: An Open-Source Multimodal Feature Extraction Tool
The OpenMM is built upon existing open-source repositories to present the first publicly available tool for multimodal feature extraction, and provides a pipeline for researchers to easily extract visual and acoustic features.
Robust Speech Emotion Recognition Under Different Encoding Conditions
This work shows that encoded audio still contains enough relevant information for robust SER, and investigates the effects of mismatched encoding conditions in the training and test set, both for traditional machine learning algorithms built on hand-crafted features and for modern end-to-end methods.
Learning with synthesized speech for automatic emotion recognition
Synthesis of artificially generated speech can indeed be used for the recognition of human emotional speech.
Unsupervised domain adaptation for speech emotion recognition using PCANet
This paper proposes a novel feature transfer approach with PCANet (a deep network), which extracts both the domain-shared and the domain-specific latent features to facilitate performance improvement in emotion recognition.
Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets
We propose an end-to-end affect recognition approach using a Convolutional Neural Network (CNN) that handles multiple languages, with applications to emotion and personality recognition from speech.
A novel feature extraction strategy for multi-stream robust emotion identification
This work investigates an effective feature extraction front-end for speech emotion recognition that performs well in clean and noisy conditions, and finds that both PMVDR and SDC offer much better robustness in noisy conditions, which is critical for real applications.
Opensmile: the munich versatile and fast open-source audio feature extractor
The openSMILE feature extraction toolkit is introduced, which unites feature extraction algorithms from the speech processing and the Music Information Retrieval communities and has a modular, component based architecture which makes extensions via plug-ins easy.
Speech Emotion Recognition Using Deep Learning on audio recordings
  • S. Suganya, E. Charles
  • Computer Science
    2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer)
  • 2019
This work proposes an end-to-end deep learning approach which applies deep neural network on a raw audio recording of speech directly to learn high-level representations from the audio waveform.
Speech emotion recognition with cross-lingual databases
The basic idea is that, since the emotion recognition system is based on acoustic features only, data in different languages can be combined to improve recognition accuracy; histogram equalization is proposed as a data normalization method.
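Histogram equalization across corpora amounts to quantile matching: each value in one corpus is replaced by the equally-ranked value of a reference corpus, so the two feature distributions coincide. The sketch below is my own minimal illustration, not the paper's implementation; the function name and toy corpora are invented.

```python
def equalize(source, reference):
    """Map each source value to the reference value of equal quantile rank.

    Assumes the source values are distinct and len(source) >= 2.
    """
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n, m = len(src_sorted), len(ref_sorted)
    rank = {v: i for i, v in enumerate(src_sorted)}
    out = []
    for v in source:
        q = rank[v] / (n - 1)                       # quantile of v in the source
        out.append(ref_sorted[round(q * (m - 1))])  # equally-ranked reference value
    return out

# Toy feature values from one corpus mapped onto another corpus's scale;
# after equalization, their value distributions coincide.
corpus_a = [0.1, 0.4, 0.2, 0.9, 0.5]
corpus_b = [10.0, 30.0, 20.0, 50.0, 40.0]
print(equalize(corpus_a, corpus_b))
```

After equalization the transformed source has, by construction, the same empirical distribution as the reference, which removes corpus-level mismatch while preserving the rank ordering of each feature.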
VoiLA: An Online Intelligent Speech Analysis and Collection Platform
VoiLA is a free web-based speech classification tool designed to educate users about state-of-the-art speech analysis paradigms and routinely provides accurate and informative voice analysis.


Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features
This work uses the same German database consisting of 6 basic emotions: sadness, boredom, neutral, anxiety, happiness, and anger, and focuses on extracting fewer but more adapted features for emotion recognition.
The INTERSPEECH 2009 emotion challenge
The challenge, the FAU Aibo Emotion Corpus, the features, and benchmark results of two popular approaches to emotion recognition from speech are introduced.
Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies
A novel approach for continuous emotion recognition based on Long Short-Term Memory Recurrent Neural Networks, which include modelling of long-range dependencies between observations and thus outperform techniques like Support Vector Regression.
EmoVoice - A Framework for Online Recognition of Emotions from Voice
EmoVoice is presented, a framework for emotional speech corpus and classifier creation and for offline as well as real-time online speech emotion recognition and some applications and prototypes that already use the framework to track online emotional user states from voice information.
Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech
Two different strategies for robust discrimination of non-verbal vocalisations such as laughter, breathing, hesitation, and consent are discussed: dynamic modelling by a broad selection of diverse acoustic Low-Level-Descriptors vs. static modelling by projection of these via statistical functionals onto a 0.6k feature space with subsequent de-correlation.
A database of German emotional speech
A database of emotional speech that was evaluated in a perception test regarding the recognisability of emotions and their naturalness and can be accessed by the public via the internet.
A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the…
Support Vector Regression for Automatic Recognition of Spontaneous Emotions in Speech
Three continuous-valued emotion primitives are used to describe emotions, namely valence, activation, and dominance, and support vector machines are used in their application for regression (support vector regression, SVR).
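The heart of SVR is the ε-insensitive loss, which ignores deviations smaller than ε and penalizes larger ones linearly; with one regressor per primitive, an utterance maps to a continuous (valence, activation, dominance) triple. The sketch below is a toy illustration of that setup, not the paper's code: the linear model weights are invented placeholders.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR's loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

# One regressor per emotion primitive. The (weight, bias) pairs below are
# hypothetical stand-ins for trained SVR models operating on one feature.
models = {"valence": (0.8, -0.1), "activation": (1.5, 0.2), "dominance": (0.6, 0.0)}

def predict_primitives(feature):
    """Map a single acoustic feature to a continuous emotion-primitive triple."""
    return {name: w * feature + b for name, (w, b) in models.items()}

print(predict_primitives(0.5))
print(eps_insensitive_loss(1.0, 1.05))  # inside the tube, so the loss is 0.0
```

The ε-tube is what distinguishes SVR from ordinary least squares: small annotation noise in the continuous emotion labels incurs no penalty, so the fit is driven only by clearly mispredicted utterances.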
Brute-forcing hierarchical functionals for paralinguistics: A waste of feature space?
Hierarchical functionals based on automatic segmentation and their systematic generation, as opposed to common expert-driven selection, are discussed to cope with rapidly growing feature spaces (>5k), together with two-stage compression based on SVM-SFFS.