Streaming End-to-end Speech Recognition for Mobile Devices
@article{He2019StreamingES,
  title={Streaming End-to-end Speech Recognition for Mobile Devices},
  author={Yanzhang He and T. Sainath and Rohit Prabhavalkar and Ian McGraw and R. Alvarez and Ding Zhao and David Rybach and A. Kannan and Y. Wu and R. Pang and Qiao Liang and Deepti Bhatia and Yuan Shangguan and Bo Li and G. Pundak and K. Sim and Tom Bagby and Shuo-Yiin Chang and K. Rao and A. Gruenstein},
  journal={ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2019},
  pages={6381-6385}
}
End-to-end (E2E) models, which directly predict output character sequences given input speech, are good candidates for on-device speech recognition. [...] Key result: In experimental evaluations, we find that the proposed approach can outperform a conventional CTC-based model in terms of both latency and accuracy in a number of evaluation categories.
183 Citations
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
- Computer Science
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
- 51
- PDF
Towards Fast and Accurate Streaming End-To-End ASR
- Computer Science, Engineering
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
- 29
- PDF
Recognizing Long-Form Speech Using Streaming End-to-End Models
- Computer Science, Engineering
- 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2019
- 25
- PDF
Using Speech Synthesis to Train End-To-End Spoken Language Understanding Models
- Computer Science, Engineering
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
- 8
- PDF
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
- Computer Science, Engineering
- INTERSPEECH
- 2020
- 2
- PDF
A Comparison of End-to-End Models for Long-Form Speech Recognition
- Computer Science, Engineering
- 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2019
- 25
- PDF
Multistate Encoding with End-To-End Speech RNN Transducer Network
- Computer Science
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
A review of on-device fully neural end-to-end automatic speech recognition algorithms
- Computer Science
- ArXiv
- 2020
- Highly Influenced
- PDF
References
EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding
- Computer Science
- 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
- 2015
- 542
- PDF
Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer
- Computer Science, Engineering
- 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2017
- 164
- PDF
Contextual Speech Recognition in End-to-end Neural Network Systems Using Beam Search
- Computer Science
- INTERSPEECH
- 2018
- 23
- PDF
Semi-supervised Training for End-to-end Models via Weak Distillation
- Computer Science
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
- 22
Personalized speech recognition on mobile devices
- Computer Science
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
- 113
- PDF
End-to-end attention-based large vocabulary speech recognition
- Computer Science
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
- 749
- PDF
Joint CTC-attention based end-to-end speech recognition using multi-task learning
- Computer Science
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2017
- 371
- PDF
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
- Computer Science, Engineering
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
- 616
- PDF