The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task
@inproceedings{Coutinho2014TheML, title={The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task}, author={Eduardo Coutinho and Felix Weninger and Bj{\"o}rn Schuller and Klaus R. Scherer}, booktitle={MediaEval}, year={2014} }
In this paper we describe TUM's approach for the MediaEval "Emotion in Music" task. The goal of this task is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. Our system consists of Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression. We used two different sets of acoustic and psychoacoustic features that have been previously proven as effective for emotion prediction in…
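A minimal sketch of the kind of sequence regressor the abstract describes, assuming PyTorch; the feature dimension, layer sizes, and training loss are illustrative placeholders, not the authors' actual configuration:

```python
# Sequence-to-sequence LSTM regressor: one arousal and one valence value
# per analysis frame, in the spirit of the approach described above.
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    def __init__(self, n_features: int = 260, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # [arousal, valence] per frame

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features) -> (batch, time, 2)
        out, _ = self.lstm(x)
        return self.head(out)

model = EmotionLSTM()
frames = torch.randn(4, 60, 260)   # 4 songs, 60 feature frames each (dummy data)
pred = model(frames)               # time-continuous A/V estimates
loss = nn.MSELoss()(pred, torch.zeros_like(pred))  # placeholder target
```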
24 Citations
Time-continuous Estimation of Emotion in Music with Recurrent Neural Networks
- Computer Science, MediaEval
- 2015
The IRIT approach for the MediaEval 2015 "Emotion in Music" task predicted two real-valued emotion dimensions, namely valence and arousal, in a time-continuous fashion, using recurrent neural networks for their sequence-modeling capabilities.
Automatically Estimating Emotion in Music with Deep Long-Short Term Memory Recurrent Neural Networks
- Computer Science, MediaEval
- 2015
The method consists of deep Long Short-Term Memory Recurrent Neural Networks for dynamic Arousal and Valence regression, using acoustic and psychoacoustic features extracted from the songs that have previously been proven effective for emotion prediction in music.
Attentive RNNs for Continuous-time Emotion Prediction in Music Clips
- Computer Science, AffCon@AAAI
- 2020
An attentive LSTM-based approach for emotion prediction from music clips is described; the authors postulate that attending to specific regions in the past gives the model a better chance of predicting the emotions evoked by the present notes.
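A toy sketch of the attention idea in this entry, assuming PyTorch: an LSTM encodes the clip, and learned softmax weights over the past frames form the summary used for the arousal-valence estimate. Dimensions and the scoring layer are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentiveRegressor(nn.Module):
    def __init__(self, n_features: int = 40, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)   # one relevance score per frame
        self.head = nn.Linear(hidden, 2)    # [arousal, valence]

    def forward(self, x):
        h, _ = self.lstm(x)                      # (batch, time, hidden)
        w = torch.softmax(self.score(h), dim=1)  # attention over past frames
        context = (w * h).sum(dim=1)             # weighted summary of history
        return self.head(context)                # one A/V estimate per clip

av = AttentiveRegressor()(torch.randn(8, 100, 40))  # -> shape (8, 2)
```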
A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction
- Computer Science, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
The experimental results demonstrated the effectiveness of the proposed multi-scale DBLSTM-ELM model, which achieved the best performance on the MediaEval 2015 "Emotion in Music" database compared with the other submitted results.
DBLSTM-based multi-scale fusion for dynamic emotion prediction in music
- Computer Science, 2016 IEEE International Conference on Multimedia and Expo (ICME)
- 2016
A Deep Bidirectional Long Short-Term Memory (DBLSTM) based multi-scale regression method is proposed; the results show that this method achieves significant improvement compared with state-of-the-art methods.
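The two DBLSTM entries above share a multi-scale theme. A minimal sketch of that idea, assuming PyTorch: run a bidirectional LSTM over the same features smoothed at several window lengths and average the per-scale predictions. The window lengths and the simple averaging fusion are illustrative stand-ins for the papers' fusion schemes (ELM-based in the first).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiLSTMRegressor(nn.Module):
    def __init__(self, n_features: int = 40, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)  # [arousal, valence] per frame

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out)

def smooth(x, win):
    # moving-average smoothing along the time axis via 1-D average pooling
    z = F.avg_pool1d(x.transpose(1, 2), win, stride=1, padding=win // 2,
                     count_include_pad=False).transpose(1, 2)
    return z[:, :x.size(1)]

model = BiLSTMRegressor()
x = torch.randn(2, 120, 40)  # dummy feature sequences
pred = torch.stack([model(smooth(x, w)) for w in (1, 5, 9)]).mean(0)  # fused
```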
Emotion and Themes Recognition in Music Utilising Convolutional and Recurrent Neural Networks
- Computer Science, MediaEval
- 2019
This study presents a fusion system of end-to-end convolutional recurrent neural networks (CRNN) and pre-trained convolutional feature extractors for music emotion and theme recognition.
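A bare-bones CRNN in the spirit of this entry, assuming PyTorch: a convolutional front-end over mel-spectrogram frames feeding a recurrent layer, with sigmoid outputs for multi-label emotion/theme tags. All dimensions and the tag count are illustrative, not the submitted system's architecture.

```python
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    def __init__(self, n_mels: int = 64, hidden: int = 64, n_tags: int = 8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),                    # halve the time resolution
        )
        self.rnn = nn.LSTM(128, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_tags)

    def forward(self, spec):                    # spec: (batch, n_mels, time)
        z = self.conv(spec).transpose(1, 2)     # (batch, time/2, 128)
        _, (h, _) = self.rnn(z)
        return torch.sigmoid(self.head(h[-1]))  # multi-label tag probabilities

tags = TinyCRNN()(torch.randn(2, 64, 256))      # -> shape (2, 8)
```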
Emotion in Music task: Lessons Learned
- Computer Science, MediaEval
- 2016
The challenges faced, and the solutions found, in using crowdsourcing on Amazon Mechanical Turk to annotate a corpus of music pieces with continuous emotion annotations are described; both annotation time delay and the demand for absolute ratings degraded the quality of the data.
Developing a benchmark for emotional analysis of music
- Computer Science, PLoS ONE
- 2017
The results from the benchmark suggest that recurrent neural network based approaches combined with large feature sets work best for dynamic MER, and a MediaEval Database for Emotional Analysis in Music is released.
Explaining Perceived Emotion Predictions in Music: An Attentive Approach
- Computer Science, ISMIR
- 2020
Three attentive LSTM-based models for dynamic emotion prediction from music clips are described; it is observed that the models attend to frames that contribute to changes in the reported arousal-valence values, and to chroma, to produce better emotion predictions, effectively capturing long-term dependencies.
Continuous Music Emotion Recognition Using Selected Audio Features
- Computer Science, 2019 42nd International Conference on Telecommunications and Signal Processing (TSP)
- 2019
The experiments show that a combination of the EiMME2015 baseline features, LP coefficients, and the proposed set of music features significantly increases system performance for both the arousal and valence emotional dimensions.
References
Emotion in Music Task at MediaEval 2015
- Computer Science, MediaEval
- 2015
The dataset consists of music from the Free Music Archive licensed under Creative Commons, which can be shared freely without restrictions, and the two required and optional runs are described.
Psychoacoustic cues to emotion in speech prosody and music
- Psychology, Cognition & Emotion
- 2013
It is shown that a significant part of the listeners’ second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness.
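A few of these cues can be roughly approximated with librosa; the sketch below is a stand-in under that assumption, not the feature set used in the cited study. Sharpness and roughness have no direct librosa equivalent, and "clip.wav" is a placeholder path.

```python
import librosa

# Load any short audio file (placeholder path).
y, sr = librosa.load("clip.wav")

rms = librosa.feature.rms(y=y)[0]                          # loudness proxy
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
flux = librosa.onset.onset_strength(y=y, sr=sr)            # spectral-flux-like
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)             # tempo estimate

print(f"tempo ~ {float(tempo):.1f} BPM, mean centroid ~ {centroid.mean():.0f} Hz")
```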
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common
- Psychology, Front. Psychol.
- 2013
Starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, the worth of individual features across these three domains is interpreted on four audio databases with observer annotations in the arousal and valence dimensions, revealing a high degree of cross-domain consistency in the encoding of the two main dimensions of affect.
Learning to forget: continual prediction with LSTM
- Computer Science
- 1999
This work identifies a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends, and proposes an adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
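For reference, the standard forget-gate update as it appears in the later LSTM literature (notation varies across papers): f_t gates how much of the previous cell state is retained, which is what lets the cell reset itself on continual input streams.

```latex
f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right), \qquad
c_t = f_t \odot c_{t-1} + i_t \odot \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right)
```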
Recent developments in openSMILE, the munich open-source multimedia feature extractor
- Computer Science, ACM Multimedia
- 2013
We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features…
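A short sketch of feature extraction with the opensmile Python package, which wraps the openSMILE toolkit; the package and the ComParE_2016 feature set postdate this 2013 paper, so this illustrates the toolkit's current Python interface rather than the setup described in the reference ("song.wav" is a placeholder).

```python
import opensmile

# Extract one row of utterance-level functionals with a standard feature set.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)
features = smile.process_file("song.wav")  # returns a pandas DataFrame
print(features.shape)
```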
MIR in Matlab (II): A Toolbox for Musical Feature Extraction from Audio
- Computer Science, ISMIR
- 2007
An overview of the set of features (related, among others, to timbre, tonality, rhythm, and form) that can be extracted with the MIRtoolbox, an integrated set of functions written in Matlab dedicated to the extraction of musical features from audio files.
A Matlab Toolbox for Musical Feature Extraction from Audio
- Computer Science
- 2007
An overview of the set of features (related, among others, to timbre, tonality, rhythm, and form) that can be extracted with MIRtoolbox, an integrated set of functions written in Matlab dedicated to the extraction of musical features from audio files.