Publications
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
Deep neural networks for acoustic modeling can outperform Gaussian mixture models on speech recognition benchmarks.
  • Citations: 6,532 (250 highly influential)
Deep Neural Networks for Acoustic Modeling in Speech Recognition
Deep neural networks with many hidden layers that are trained using new methods have been shown to outperform Gaussian mixture models on a variety of speech recognition benchmarks, sometimes by a large margin.
  • Citations: 2,104 (149 highly influential)
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition
We propose a novel context-dependent (CD) model for large-vocabulary speech recognition (LVSR) that leverages recent advances in using deep belief networks for phone recognition.
  • Citations: 2,490 (132 highly influential)
Deep Learning: Methods and Applications
  • L. Deng and Dong Yu
  • Foundations and Trends in Signal Processing
  • 12 June 2014
This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks.
  • Citations: 1,860 (119 highly influential)
Permutation invariant training of deep models for speaker-independent multi-talker speech separation
We propose a novel deep learning training criterion, named permutation invariant training (PIT), for speaker-independent multi-talker speech separation, commonly known as the cocktail-party problem.
  • Citations: 379 (72 highly influential)
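The core idea of PIT can be sketched in a few lines: evaluate the separation loss under every assignment of estimated sources to reference speakers and train on the minimum, so the model is not penalized for producing the speakers in a different order. A minimal NumPy sketch (the function name and the use of mean squared error are illustrative assumptions, not the paper's exact implementation):

```python
import itertools
import numpy as np

def pit_loss(estimates, targets):
    """Permutation invariant training (PIT) loss.

    Tries every permutation of the estimated sources against the
    reference sources and returns the minimum loss, making training
    insensitive to speaker ordering.

    estimates, targets: arrays of shape (num_speakers, num_frames)
    """
    num_spk = estimates.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(num_spk)):
        # mean squared error under this speaker-to-reference assignment
        loss = np.mean((estimates[list(perm)] - targets) ** 2)
        best = min(best, loss)
    return best
```

For example, estimates that match the references but in swapped order incur zero loss, which an order-sensitive loss would heavily penalize.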
Convolutional Neural Networks for Speech Recognition
We show that convolutional neural networks can reduce the error rate by 6%-10% compared with DNNs on the TIMIT phone recognition and the voice search large vocabulary speech recognition tasks.
  • Citations: 1,271 (61 highly influential)
1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs
We show empirically that in SGD training of deep neural networks, one can quantize the gradients aggressively, to as little as one bit per value, at no or nearly no loss of accuracy, provided the quantization error is carried forward across minibatches (error feedback).
  • Citations: 511 (59 highly influential)
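The mechanism described above (sign quantization with the residual error fed back into the next minibatch) can be sketched as follows. The per-tensor scaling by the mean absolute value is an illustrative reconstruction choice, not necessarily the paper's exact rule:

```python
import numpy as np

def one_bit_quantize(gradient, residual):
    """1-bit gradient quantization with error feedback.

    The quantization error from the previous minibatch (residual) is
    added back before quantizing, so the error is carried forward
    rather than lost. Each value is transmitted as its sign plus one
    shared scale factor per tensor, i.e. roughly 1 bit per value.
    """
    g = gradient + residual          # error feedback
    scale = np.mean(np.abs(g))       # one scalar per tensor
    quantized = np.sign(g) * scale   # 1 bit per value + one float scale
    new_residual = g - quantized     # error carried to the next minibatch
    return quantized, new_residual
```

Because the residual is re-added at every step, the quantized gradients sum to the true gradients over time, which is why the aggressive compression costs little accuracy.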
Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks
In this paper, we propose the utterance-level permutation invariant training (uPIT) technique. uPIT is a practically applicable, end-to-end, deep-learning-based solution for speaker-independent multi-talker speech separation.
  • Citations: 315 (58 highly influential)
Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription
We investigate the potential of Context-Dependent Deep-Neural-Network HMMs, or CD-DNN-HMMs, from a feature-engineering perspective.
  • Citations: 637 (49 highly influential)
An introduction to computational networks and the computational network toolkit (invited talk)
We introduce the computational network (CN), a unified framework for describing arbitrary learning machines, such as deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, logistic regression, and maximum entropy models, that can be illustrated as a series of computational steps.
  • Citations: 365 (48 highly influential)