No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
@article{Sainath2018NoNF,
  title={No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models},
  author={T. Sainath and Rohit Prabhavalkar and Shankar Kumar and S. Lee and A. Kannan and David Rybach and Vlad Schogol and P. Nguyen and Bo Li and Y. Wu and Z. Chen and Chung-Cheng Chiu},
  journal={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2018},
  pages={5859-5863}
}
For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has begun to be challenged recently by end-to-end models which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since they remove the need for a separate expert-curated pronunciation lexicon to map from phoneme-based…
30 Citations
Phoebe: Pronunciation-aware Contextualization for End-to-end Speech Recognition
- Computer Science
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
- 12
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition
- Computer Science
- INTERSPEECH
- 2019
- 36
Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models
- Computer Science, Engineering
- INTERSPEECH
- 2019
- 8
Investigating the Downstream Impact of Grapheme-Based Acoustic Modeling on Spoken Utterance Classification
- Computer Science
- 2018 IEEE Spoken Language Technology Workshop (SLT)
- 2018
- 1
An Exploration of Directly Using Word as Acoustic Modeling Unit for Speech Recognition
- Computer Science
- 2018 IEEE Spoken Language Technology Workshop (SLT)
- 2018
- 4
A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese
- Computer Science, Engineering
- ICONIP
- 2018
- 39
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition
- Computer Science, Engineering
- 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2019
- 28
A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models
- Computer Science, Engineering
- 2020
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition
- Computer Science
- INTERSPEECH
- 2019
- 2
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
- Computer Science, Engineering
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
- 1
References
Acoustic data-driven pronunciation lexicon for large vocabulary speech recognition
- Computer Science
- 2013 IEEE Workshop on Automatic Speech Recognition and Understanding
- 2013
- 42
Revisiting graphemes with increasing amounts of data
- Computer Science
- 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2009
- 17
A Comparison of Sequence-to-Sequence Models for Speech Recognition
- Computer Science
- INTERSPEECH
- 2017
- 152
From speech to letters - using a novel neural network architecture for grapheme based ASR
- Computer Science
- 2009 IEEE Workshop on Automatic Speech Recognition & Understanding
- 2009
- 36
An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model
- Computer Science, Engineering
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
- 86
Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer
- Computer Science, Engineering
- 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2017
- 163
Learning Lexicons From Speech Using a Pronunciation Mixture Model
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2013
- 48
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
- Computer Science, Engineering
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
- 615