Corpus ID: 7743432

End-to-End Deep Neural Network for Automatic Speech Recognition

@inproceedings{Song2015EndtoEndDN,
  title={End-to-End Deep Neural Network for Automatic Speech Recognition},
  author={William Song},
  year={2015}
}
  • William Song
  • Published 2015
  • We investigate the efficacy of deep neural networks for speech recognition. Specifically, we implement an end-to-end deep learning system that maps mel filter bank features directly to spoken phonemes, without the need for a traditional Hidden Markov Model for decoding. The system comprises two variants of neural networks for phoneme recognition. In particular, we use a convolutional architecture for frame-level classification and a recurrent architecture with Connectionist Temporal…
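
The HMM-free decoding step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the truncated phrase refers to Connectionist Temporal Classification (CTC), the technique named in the reference list on this page, and it shows only greedy (best-path) CTC decoding: merge consecutive repeated frame labels, then drop the blank symbol. The function name `ctc_greedy_decode`, the blank symbol `_`, and the example frame labels are hypothetical and are not taken from the paper.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems usually reserve an integer index


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Greedy CTC decoding over per-frame best labels.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove the blank symbol.
    A blank between two identical labels keeps them as separate outputs.
    """
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


# Illustrative per-frame argmax labels (made up for this example).
frames = ["_", "k", "k", "_", "ae", "ae", "ae", "_", "t", "t", "_"]
print(ctc_greedy_decode(frames))  # ['k', 'ae', 't']
```

In a full system the per-frame labels would come from the argmax of the recurrent network's output at each time step; a beam search over the output distributions is a common alternative to this greedy pass.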
    30 Citations

    Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks
    • 249
    Deep Neural Networks for Acoustic Modeling in the Presence of Noise
    • 4
    An Overview of End-to-End Automatic Speech Recognition
    • 14
    CNN-Self-Attention-DNN Architecture For Mandarin Recognition
    • Chengtao Cai, D. Guo
    • Computer Science
    • 2020 Chinese Control And Decision Conference (CCDC)
    • 2020
    Improving Automatic Speech Recognition Utilizing Audio-codecs for Data Augmentation
    A comparative study of state-of-the-art speech recognition models for English and Dutch
    Dual supervised learning for non-native speech recognition
    • 7

    References

    SHOWING 1-10 OF 16 REFERENCES
    Towards End-To-End Speech Recognition with Recurrent Neural Networks
    • 1,495
    Speech recognition with deep recurrent neural networks
    • 5,689
    Hybrid speech recognition with Deep Bidirectional LSTM
    • 1,069
    Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
    • 6,378
    Deep convolutional neural networks for LVCSR
    • 860
    Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks
    • 2,395
    • Highly Influential
    Recent advances in deep learning for speech research at Microsoft
    • L. Deng, J. Li, +9 authors A. Acero
    • Computer Science
    • 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
    • 2013
    • 599
    Framewise phoneme classification with bidirectional LSTM and other neural network architectures
    • 2,476
    Connectionist Speech Recognition: A Hybrid Approach
    • 1,356
    Sequence Transduction with Recurrent Neural Networks
    • A. Graves
    • Computer Science, Mathematics
    • ArXiv
    • 2012
    • 605