Recent advances in deep learning for speech research at Microsoft

@inproceedings{Deng2013RecentAI,
  title={Recent advances in deep learning for speech research at Microsoft},
  author={Li Deng and Jinyu Li and Jui-Ting Huang and Kaisheng Yao and Dong Yu and Frank Seide and Michael L. Seltzer and Geoffrey Zweig and Xiaodong He and Jason D. Williams and Yifan Gong and Alex Acero},
  booktitle={2013 IEEE International Conference on Acoustics, Speech and Signal Processing},
  year={2013},
  pages={8604--8608}
}
Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light on the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected…

Key Quantitative Results

  • As shown in Table 6, on a large-vocabulary speech recognition task, an SGD implementation of fDLR and of top-softmax-layer adaptation reduces word errors by 17% and 14%, respectively, compared to the baseline DNN performance.
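
To make the fDLR result above more concrete, here is a minimal sketch. It assumes that fDLR refers to feature-space discriminative linear regression: an affine transform on the input features is trained by SGD through a frozen network. The one-hidden-layer network, the function names (`forward`, `adapt_fdlr`) and all hyperparameters are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, W1, b1, W2, b2):
    # Toy stand-in for the baseline DNN: one tanh hidden layer + softmax.
    h = np.tanh(x @ W1 + b1)
    return h, softmax(h @ W2 + b2)

def cross_entropy(p, y):
    # Mean negative log-likelihood of the integer labels y.
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def adapt_fdlr(x, y, W1, b1, W2, b2, lr=0.01, epochs=100, batch=32, seed=0):
    """Learn an affine feature transform x -> x @ A + c by minibatch SGD.

    The network weights (W1, b1, W2, b2) stay frozen; only A and c change.
    A starts at the identity, so adaptation starts from the baseline.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    A = np.eye(d)
    c = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch):
            idx = order[start:start + batch]
            xb, yb = x[idx], y[idx]
            xt = xb @ A + c                      # transformed features
            h, p = forward(xt, W1, b1, W2, b2)
            # Back-propagate the cross-entropy loss down to the features.
            dlogits = p.copy()
            dlogits[np.arange(len(yb)), yb] -= 1.0
            dlogits /= len(yb)
            dpre = (dlogits @ W2.T) * (1.0 - h ** 2)
            dxt = dpre @ W1.T
            # Gradient step on the transform parameters only.
            A -= lr * (xb.T @ dxt)
            c -= lr * dxt.sum(axis=0)
    return A, c
```

Under the same assumptions, top-softmax-layer adaptation would instead keep the features fixed and update only `W2` and `b2` on the adaptation data.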

Citations

Publications citing this paper.

THCHS-30 : A Free Chinese Speech Corpus

Unsupervised adaptation with domain separation networks for robust speech recognition

  • 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
  • 2017

End-to-End Speech Recognition Models

Speech recognition with temporal neural networks

Head motion synthesis from speech using deep neural networks

  • Multimedia Tools and Applications
  • 2014

Speech-driven head motion synthesis using neural networks

Speech recognition using deep neural network - recent trends

CITATION STATISTICS

  • 17 Highly Influenced Citations

  • Averaged 43 Citations per year from 2017 through 2019
