Non-linear frequency warping using constant-Q transformation for speech emotion recognition
@article{Singh2021NonlinearFW,
  title={Non-linear frequency warping using constant-Q transformation for speech emotion recognition},
  author={Premjeet Singh and Goutam Saha and Md. Sahidullah},
  journal={2021 International Conference on Computer Communication and Informatics (ICCCI)},
  year={2021},
  pages={1-6}
}
In this work, we explore the constant-Q transform (CQT) for speech emotion recognition (SER). CQT-based time-frequency analysis provides variable spectro-temporal resolution, with higher frequency resolution at lower frequencies. Since the lower-frequency regions of the speech signal carry more emotion-related information than the higher-frequency regions, the increased low-frequency resolution of the CQT makes it more promising for SER than the standard short-time Fourier transform (STFT). We present a…
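The resolution contrast described in the abstract can be made concrete with a short sketch. The following is a minimal illustration, not the authors' implementation: it uses librosa's standard stft and cqt routines on a synthetic signal, and every parameter choice (sampling rate, fmin, number of bins, bins per octave, hop lengths) is an assumption made for demonstration. It prints the bin spacing to show that CQT bins are geometrically spaced, f_k = fmin · 2^(k/B), so low frequencies get finer resolution, whereas STFT bin spacing is uniform across the whole band.

```python
# Minimal sketch (not the authors' code) contrasting STFT and CQT frequency
# resolution with librosa. All parameters below are illustrative assumptions.
import numpy as np
import librosa

sr = 16000                                   # assumed speech sampling rate
t = np.arange(0, 2.0, 1.0 / sr)
# Synthetic test signal: two closely spaced low-frequency tones (120 Hz, 150 Hz)
# plus a weaker high-frequency tone, standing in for low-band emotion cues.
y = (np.sin(2 * np.pi * 120 * t)
     + np.sin(2 * np.pi * 150 * t)
     + 0.3 * np.sin(2 * np.pi * 3000 * t)).astype(np.float32)

# STFT: linearly spaced bins, so resolution is sr / n_fft Hz at every frequency.
n_fft = 512
stft_mag = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=256))
print(f"STFT bin spacing: {sr / n_fft:.1f} Hz everywhere")

# CQT: geometrically spaced bins f_k = fmin * 2**(k / bins_per_octave), so the
# absolute bandwidth shrinks at low frequencies while Q = f_k / bandwidth_k
# stays constant, giving finer low-frequency detail than the STFT.
fmin = librosa.note_to_hz("C2")              # ~65 Hz lower bound (assumption)
n_bins, bpo = 72, 12                         # 6 octaves, 12 bins per octave
cqt_mag = np.abs(librosa.cqt(y, sr=sr, fmin=fmin, n_bins=n_bins,
                             bins_per_octave=bpo, hop_length=256))
freqs = librosa.cqt_frequencies(n_bins=n_bins, fmin=fmin, bins_per_octave=bpo)
print(f"CQT bin spacing near 130 Hz: {freqs[13] - freqs[12]:.1f} Hz")
print(f"CQT bin spacing near 2 kHz:  {freqs[60] - freqs[59]:.1f} Hz")
```

In an SER pipeline, magnitude spectrograms of either kind would typically be log-compressed and passed to a classifier; the point of the sketch is only the resolution trade-off between the two representations.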
6 Citations
Analysis of constant-Q filterbank based representations for speech emotion recognition
- Computer Science · Digit. Signal Process.
- 2022
Modulation spectral features for speech emotion recognition using deep neural networks
- Computer Science · Speech Commun.
- 2023
Deep scattering network for speech emotion recognition
- Computer Science · 2021 29th European Signal Processing Conference (EUSIPCO)
- 2021
This paper introduces the scattering transform for speech emotion recognition (SER) and investigates layer-wise scattering coefficients to analyse the importance of time-shift- and deformation-stable scalogram and modulation spectrum coefficients for SER.
Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification
- Computer Science · Neural Processing Letters
- 2022
A novel multi-frequency resolution (MFR) feature is proposed in this paper to address the limitation that existing single-frequency-resolution time–frequency features cannot effectively capture the characteristics of multiple types of sound.
Evaluation of handcrafted features and learned representations for the classification of arrhythmia and congestive heart failure in ECG
- Medicine · Biomed. Signal Process. Control.
- 2023
Convolution-Vision Transformer for Automatic Lung Sound Classification
- Computer Science · 2022 35th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
- 2022
This work proposes a hybrid Convolution-Vision Transformer architecture that combines convolutional layers with Vision Transformers in a single system and evaluates its effectiveness on the ICBHI 2017 database.
References
Showing 1–10 of 37 references
Amplitude-Frequency Analysis of Emotional Speech Using Transfer Learning and Classification of Spectrogram Images
- Environmental Science · Advances in Science, Technology and Engineering Systems Journal
- 2018
This study used an indirect approach to provide insights into the amplitude-frequency characteristics of different emotions, in order to support the development of future SER methods that differentiate emotions more effectively.
Speech emotion recognition with deep convolutional neural networks
- Computer Science · Biomed. Signal Process. Control.
- 2020
Formant position based weighted spectral features for emotion recognition
- Computer Science · Speech Commun.
- 2011
Synthetic speech detection using fundamental frequency variation and spectral features
- Computer Science · Comput. Speech Lang.
- 2018
A comparative study of traditional and newly proposed features for recognition of speech under stress
- Engineering · IEEE Trans. Speech Audio Process.
- 2000
The results show that, while the fast Fourier transform (FFT) is more immune to noise, the linear prediction power spectrum is more immune than the FFT to stress as well as to combined noisy and stressful conditions.
Towards a standard set of acoustic features for the processing of emotion in speech.
- Psychology
- 2010
Researchers concerned with the automatic recognition of human emotion in speech have proposed a considerable variety of segmental and supra-segmental acoustic descriptors. These range from prosodic…
Multiscale Amplitude Feature and Significance of Enhanced Vocal Tract Information for Emotion Classification
- Computer Science · IEEE Transactions on Cybernetics
- 2019
A novel multiscale amplitude feature is proposed using multiresolution analysis (MRA), the significance of vocal tract information for emotion classification from the speech signal is investigated, and the proposed feature is shown to outperform the other features.
New approach in quantification of emotional intensity from the speech signal: emotional temperature
- Computer Science · Expert Syst. Appl.
- 2015
The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing
- Computer Science · IEEE Transactions on Affective Computing
- 2016
A basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis, is proposed; it is intended to provide a common baseline for the evaluation of future research and to eliminate differences caused by varying parameter sets or even different implementations of the same parameters.
Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
- Computer Science · IEEE Transactions on Multimedia
- 2018
This paper explores how to utilize a DCNN to bridge the affective gap in speech signals, and finds that a DCNN model pretrained for image applications performs reasonably well in affective speech feature extraction.