Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models

@article{Lu2019SpeechSA,
  title={Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models},
  author={Zhiyun Lu and Liangliang Cao and Yifan Zhang and Chung-Cheng Chiu and James Fan},
  journal={ArXiv},
  year={2019},
  volume={abs/1911.09762}
}
  • Zhiyun Lu, Liangliang Cao, Yifan Zhang, Chung-Cheng Chiu, James Fan
  • Published in ICASSP 2019
  • Computer Science, Engineering
  • ArXiv
  • In this paper, we propose to use pre-trained features from end-to-end ASR models to solve the speech sentiment analysis problem as a downstream task. We show that end-to-end ASR features, which integrate both acoustic and text information from speech, achieve promising results. We use an RNN with self-attention as the sentiment classifier, which also provides an easy visualization through attention weights to help interpret model predictions. We use the well-benchmarked IEMOCAP dataset and a new…
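
The abstract's idea of attention weights that double as an interpretation aid can be illustrated with a minimal sketch. This is not the authors' architecture: the recurrent layer is omitted, a single attention head is used, and `features`, `query`, `class_weights`, and `class_bias` are hypothetical stand-ins for pre-trained ASR encoder outputs and learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool T frame-level feature vectors into a single vector.

    features: list of T vectors (each a list of D floats); assumed here to
              stand in for encoder outputs of a pre-trained ASR model.
    query:    hypothetical learned D-dimensional scoring vector.
    Returns (weights, pooled); weights has length T and sums to 1, so it
    can be plotted over time to see which frames drove the prediction.
    """
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


def classify(pooled, class_weights, class_bias):
    """Linear layer followed by softmax over sentiment classes."""
    logits = [
        sum(w_d * x_d for w_d, x_d in zip(w, pooled)) + b
        for w, b in zip(class_weights, class_bias)
    ]
    return softmax(logits)


# Toy input: three frames of 2-dimensional features (made-up values).
features = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
weights, pooled = attention_pool(features, query)

# Three hypothetical sentiment classes with made-up parameters.
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
class_bias = [0.0, 0.0, 0.0]
probs = classify(pooled, class_weights, class_bias)
```

In this toy input the second frame has the highest score, so it receives the largest attention weight; inspecting `weights` per frame is the kind of visualization the abstract refers to.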

    References

    Publications referenced by this paper.

    IEMOCAP: interactive emotional dyadic motion capture database

    A comparison of sequence-to-sequence models for speech recognition

    • Rohit Prabhavalkar, Kanishka Rao, +3 authors Navdeep Jaitly
    • Interspeech, 2017, pp. 939–943.

    Sequence Transduction with Recurrent Neural Networks

    Attention-augmented End-to-end Multi-task Learning for Emotion Prediction from Speech

    DNN-based Emotion Recognition Based on Bottleneck Acoustic Features and Lexical Features

    • Eesung Kim, Jong Won Shin
    • Computer Science
    • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

    Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition

    Incorporating End-to-End Speech Recognition Models for Sentiment Analysis

    SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

    Speech Emotion Classification Using Attention-Based LSTM
