Corpus ID: 232380109

Residual Energy-Based Models for End-to-End Speech Recognition

  • Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, P. Woodland
  • Published 2021
  • Computer Science, Engineering
  • ArXiv
End-to-end models with auto-regressive decoders have shown impressive results for automatic speech recognition (ASR). These models formulate the sequence-level probability as a product of the conditional probabilities of all individual tokens given their histories. However, the performance of locally normalised models can be sub-optimal because of factors such as exposure bias. Consequently, the model distribution differs from the underlying data distribution. In this paper, the residual energy…
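The factorisation and rescoring idea described in the abstract can be sketched numerically. This is an illustrative sketch only, not the paper's implementation: the function names, toy log-probabilities, and energy values below are made up for demonstration.

```python
def sequence_log_prob(token_log_probs):
    """Locally normalised (autoregressive) sequence probability:
    log p(y|x) = sum over t of log p(y_t | y_<t, x)."""
    return sum(token_log_probs)

def residual_ebm_score(token_log_probs, energy):
    """Residual energy-based rescoring (sketch): a residual EBM defines
    p(y|x) proportional to p_AR(y|x) * exp(-E(x, y)), so up to a
    normalising constant the log-score is log p_AR(y|x) - E(x, y)."""
    return sequence_log_prob(token_log_probs) - energy

# Toy rescoring of two hypotheses from an n-best list: the residual
# energy term can reorder hypotheses that the locally normalised
# model ranks sub-optimally.
hyp_a = ([-0.1, -0.2, -0.3], 0.5)   # (per-token log-probs, energy E)
hyp_b = ([-0.2, -0.2, -0.3], 0.0)
best = max([hyp_a, hyp_b], key=lambda h: residual_ebm_score(*h))
```

Here `hyp_a` has the higher autoregressive log-probability, but its larger residual energy pushes its combined score below that of `hyp_b`, so the residual term changes the ranking.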
Figures and Tables from this paper

1 Citation

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction


References

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition
  • Cited by 3
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
  • C. Chiu, T. Sainath, +11 authors M. Bacchiani
  • Computer Science, Engineering
  • 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Cited by 664
Two-Pass End-to-End Speech Recognition
  • Cited by 45
An Evaluation of Word-Level Confidence Estimation for End-to-End Automatic Speech Recognition
  • Cited by 2
Confidence Measures in Encoder-Decoder Models for Speech Recognition
  • Cited by 2
Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models
  • Cited by 76
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
  • T. Sainath, Yanzhang He, +26 authors D. Zhao
  • Computer Science
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Cited by 65
Learning Word-Level Confidence For Subword End-to-End ASR
  • Cited by 4
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300
  • Cited by 20