Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice
@inproceedings{Mitra2019LeveragingAC,
  title={Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice},
  author={V. Mitra and Sue Booker and E. Marchi and David Scott Farrar and Ute Dorothea Peitz and Bridget J. Cheng and Ermine A. Teves and A. Mehta and Devang Naik},
  booktitle={INTERSPEECH},
  year={2019}
}
Millions of people reach out to digital assistants such as Siri every day, asking for information, making phone calls, seeking assistance, and much more. The expectation is that such assistants should understand the intent of the user's query. Detecting the intent of a query from a short, isolated utterance is a difficult task. Intent cannot always be obtained from speech-recognized transcriptions. A transcription-driven approach can interpret what has been said but fails to acknowledge how it…
2 Citations
Detecting Emotion Primitives from Speech and Their Use in Discerning Categorical Emotions
- Computer Science, Engineering
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
Attentive Modality Hopping Mechanism for Speech Emotion Recognition
- Computer Science, Mathematics
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
- 7