Corpus ID: 3157015

The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task

@inproceedings{Coutinho2014TheML,
  title={The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task},
  author={Eduardo Coutinho and Felix Weninger and Bj{\"o}rn Schuller and Klaus R. Scherer},
  booktitle={MediaEval},
  year={2014}
}
In this paper we describe TUM's approach to the MediaEval 2014 "Emotion in Music" task. The goal of this task is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. Our system consists of Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for dynamic Arousal and Valence regression. We used two different sets of acoustic and psychoacoustic features that have been previously proven effective for emotion prediction in…
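
As a rough, illustrative sketch of this kind of model (not the authors' implementation; the feature dimensionality, layer sizes, and frame counts below are placeholder assumptions), a time-continuous Arousal/Valence regressor built from an LSTM-RNN could be written in PyTorch as follows:

import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    # Maps a sequence of acoustic feature frames to per-frame
    # [arousal, valence] values. All sizes are illustrative only.
    def __init__(self, n_features=260, hidden=128, n_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=n_layers,
                            batch_first=True)
        self.head = nn.Linear(hidden, 2)   # one (A, V) pair per frame

    def forward(self, x):            # x: (batch, time, n_features)
        h, _ = self.lstm(x)          # h: (batch, time, hidden)
        return self.head(h)          # (batch, time, 2)

model = EmotionLSTM()
frames = torch.randn(4, 60, 260)     # e.g. 60 one-second feature frames
print(model(frames).shape)           # torch.Size([4, 60, 2])

Training such a model against per-frame annotations would typically minimize a regression loss such as mean squared error over the whole sequence.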

Citations

Time-continuous Estimation of Emotion in Music with Recurrent Neural Networks
TLDR
IRIT's approach to the MediaEval 2015 "Emotion in Music" task was to predict two real-valued emotion dimensions, namely valence and arousal, in a time-continuous fashion, using recurrent neural networks for their sequence modeling capabilities.
Automatically Estimating Emotion in Music with Deep Long-Short Term Memory Recurrent Neural Networks
TLDR
The method consists of deep Long Short-Term Memory Recurrent Neural Networks for dynamic Arousal and Valence regression, using acoustic and psychoacoustic features extracted from the songs that have been previously proven effective for emotion prediction in music.
Attentive RNNs for Continuous-time Emotion Prediction in Music Clips
TLDR
An attentive LSTM-based approach for emotion prediction from music clips is described; the authors postulate that attending to specific regions in the past gives the model a better chance of predicting the emotions evoked by present notes.
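
To make the idea concrete, a minimal causal-attention layer over past LSTM states might look like the following (a hypothetical sketch, not the cited model; all sizes are assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveRegressor(nn.Module):
    # At each time step, attends over all states up to the present
    # and combines the context with the current state for prediction.
    def __init__(self, n_features=64, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)      # additive attention score
        self.head = nn.Linear(hidden * 2, 2)   # [arousal, valence]

    def forward(self, x):                      # x: (B, T, F)
        h, _ = self.lstm(x)                    # (B, T, H)
        outs = []
        for t in range(h.size(1)):
            past = h[:, :t + 1]                        # states up to t
            w = F.softmax(self.score(past), dim=1)     # (B, t+1, 1)
            ctx = (w * past).sum(dim=1)                # weighted context
            outs.append(self.head(torch.cat([h[:, t], ctx], dim=-1)))
        return torch.stack(outs, dim=1)        # (B, T, 2)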
A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction
TLDR
The experimental results demonstrate the effectiveness of the proposed multi-scale DBLSTM-ELM model, which achieved the best performance on the database of the "Emotion in Music" task at MediaEval 2015 compared with other submitted results.
DBLSTM-based multi-scale fusion for dynamic emotion prediction in music
TLDR
A Deep Bidirectional Long Short-Term Memory (DBLSTM) based multi-scale regression method is proposed; the results show that this method achieves significant improvement over state-of-the-art methods.
Emotion and Themes Recognition in Music Utilising Convolutional and Recurrent Neural Networks
TLDR
This study presents a fusion system of end-to-end convolutional recurrent neural networks (CRNN) and pre-trained convolutional feature extractors for music emotion and theme recognition.
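
A compact CRNN of this general shape (a sketch under assumed input sizes and tag counts, not the cited fusion system) could be:

import torch
import torch.nn as nn

class CRNNTagger(nn.Module):
    # Convolutional front end over mel-spectrogram frames, recurrent
    # back end over time, clip-level multi-label tag probabilities.
    def __init__(self, n_mels=64, n_tags=56):  # tag count is a placeholder
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.rnn = nn.LSTM(128, 64, batch_first=True)
        self.out = nn.Linear(64, n_tags)

    def forward(self, mel):                    # mel: (B, n_mels, T)
        z = self.conv(mel)                     # (B, 128, T//2)
        h, _ = self.rnn(z.transpose(1, 2))     # (B, T//2, 64)
        return torch.sigmoid(self.out(h.mean(dim=1)))  # (B, n_tags)

mel = torch.randn(2, 64, 200)    # batch of mel-spectrograms (placeholder)
print(CRNNTagger()(mel).shape)   # torch.Size([2, 56])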
Emotion in Music task: Lessons Learned
TLDR
The challenges faced and solutions found in using crowdsourcing on Amazon Mechanical Turk to annotate a corpus of music pieces with continuous emotion annotations are discussed; both time delay and the demand for absolute ratings degraded the quality of the data.
Developing a benchmark for emotional analysis of music
TLDR
The results from the benchmark suggest that recurrent neural network based approaches combined with large feature sets work best for dynamic MER; a MediaEval Database for Emotional Analysis in Music is also released.
Explaining Perceived Emotion Predictions in Music: An Attentive Approach
TLDR
Three LSTM-based attention models for dynamic emotion prediction from music clips are described; it is observed that the models attend to frames that contribute to changes in reported arousal-valence values, and to chroma, to produce better emotion predictions, effectively capturing long-term dependencies.
Continuous Music Emotion Recognition Using Selected Audio Features
TLDR
The experiments show that a combination of the EiMME2015 baseline features, LP coefficients, and the proposed set of music features significantly increases system performance for both the arousal and valence emotional dimensions.

References

Emotion in Music Task at MediaEval 2015
TLDR
The dataset consists of music licensed under Creative Commons from the Free Music Archive, which can be shared freely without restrictions; the required and optional runs are also described.
Psychoacoustic cues to emotion in speech prosody and music
TLDR
It is shown that a significant part of the listeners’ second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness.
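
Several of these cues, or rough proxies for them, can be computed with common audio libraries; for instance, assuming librosa is available (the input file is hypothetical, and sharpness and roughness have no librosa equivalent, so they would need a dedicated psychoacoustics toolbox):

import librosa

y, sr = librosa.load("clip.wav")    # hypothetical input file
loudness = librosa.feature.rms(y=y)[0]                   # loudness proxy
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
flux = librosa.onset.onset_strength(y=y, sr=sr)          # spectral-flux proxy
tempo = librosa.beat.tempo(y=y, sr=sr)                   # global tempo estimate
print(loudness.shape, centroid.shape, flux.shape, tempo)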
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common
TLDR
Starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, the worth of individual features across these three domains is interpreted on four audio databases with observer annotations in the arousal and valence dimensions; a high degree of cross-domain consistency in encoding the two main dimensions of affect is found.
Learning to forget: continual prediction with LSTM
TLDR
This work identifies a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends and proposes an adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
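
In the now-standard formulation (with $\sigma$ the logistic sigmoid and $\odot$ the elementwise product), the forget gate $f_t$ scales the previous cell state, so the cell can learn to reset itself:

\begin{align}
  f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
  c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)
\end{align}

When $f_t$ approaches zero, the accumulated state $c_{t-1}$ is discarded, freeing the cell's internal resources for the next subsequence.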
Recent developments in openSMILE, the munich open-source multimedia feature extractor
We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features for multi-modal processing.
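
In practice, features are extracted by invoking the SMILExtract command-line tool with a configuration file; a minimal call (the config name and file paths are assumptions) wrapped in Python might be:

import subprocess

subprocess.run([
    "SMILExtract",
    "-C", "config/emobase.conf",   # feature-set configuration (assumed path)
    "-I", "song.wav",              # input audio
    "-O", "features.arff",         # extracted feature output
], check=True)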
MIR in Matlab (II): A Toolbox for Musical Feature Extraction from Audio
TLDR
An overview of the set of features, related, among others, to timbre, tonality, rhythm or form, that can be extracted with the MIRtoolbox, an integrated set of functions written in Matlab dedicated to the extraction of musical features from audio files.