Output-Gate Projected Gated Recurrent Unit for Speech Recognition

Gaofeng Cheng, Daniel Povey, Lu Huang, Ji Xu, S. Khudanpur, Yonghong Yan
In this paper, we describe work on accelerating decoding while improving decoding accuracy. First, we propose an architecture that we call the Projected Gated Recurrent Unit (PGRU) for automatic speech recognition (ASR) tasks, and show that the PGRU consistently outperforms the standard GRU. Second, to improve the PGRU's generalization, especially for large-scale ASR tasks, we propose the Output-gate PGRU (OPGRU). Finally, time delay neural network (TDNN) and…
An Exploration of Recurrent Units for Automatic Speech Recognition with RNN based Acoustic Model
  • H. Zhang
  • Computer Science
  • 2019 2nd International Conference on Information Systems and Computer Aided Education (ICISCAE)
  • 2019
Projected Minimal Gated Recurrent Unit for Speech Recognition
Simplified LSTMs for Speech Recognition
Multi-head Monotonic Chunkwise Attention For Online Speech Recognition
A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-trained Neural Network Acoustic Models
  • Chao Weng, Dong Yu
  • Computer Science, Mathematics
  • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2019
Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation
VoiceAI Systems to NIST SRE19 Evaluation: Robust Speaker Recognition on Conversational Telephone Speech
Non-autoregressive Deliberation-Attention based End-to-End ASR


Improving Speech Recognition by Revising Gated Recurrent Units
Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach
  • K. Chen, Qiang Huo
  • Computer Science
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2016
Light Gated Recurrent Units for Speech Recognition
Low Latency Acoustic Modeling Using Temporal Convolution and LSTMs
Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI
Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition
Achieving Human Parity in Conversational Speech Recognition
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
English Conversational Telephone Speech Recognition by Humans and Machines
Speaker adaptation of neural network acoustic models using i-vectors