• Publications
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
TLDR
This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
Boosted MMI for model and feature-space discriminative training
TLDR
A modified form of the maximum mutual information (MMI) objective function is presented which gives improved results for discriminative training by boosting the likelihoods of paths in the denominator lattice that have a higher phone error relative to the correct transcript.
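As a sketch of how that boosting is commonly written down (the acoustic scale κ, boosting factor b, and phone-accuracy term A(s, s_r) below are notation assumed for illustration, not taken from the entry above), the boosted objective weights each denominator path by a factor that grows with its phone error:

    F_{bMMI}(\lambda) = \sum_r \log
      \frac{p_\lambda(X_r \mid s_r)^{\kappa}\, P(s_r)}
           {\sum_s p_\lambda(X_r \mid s)^{\kappa}\, P(s)\, e^{-b\, A(s, s_r)}}

Here X_r is the r-th utterance, s_r its reference transcript, and A(s, s_r) a phone-level accuracy of hypothesis s against the reference; larger b shifts denominator probability mass toward high-error competitors.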
Deep convolutional neural networks for LVCSR
TLDR
This paper determines the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks, and explores the behavior of neural network features extracted from CNNs on a variety of LVCSR tasks, comparing CNNs to DNNs and GMMs.
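For orientation, a minimal PyTorch sketch of the general kind of CNN acoustic model such a comparison involves (the 40-mel input, 11-frame context window, layer sizes, and 3000 output states below are illustrative assumptions, not details from the paper):

    import torch
    import torch.nn as nn

    class ConvAcousticModel(nn.Module):
        # Input: a (batch, 1, n_mels, context) block of log-mel features,
        # i.e. a short window of frames treated as a 2-D "image".
        def __init__(self, n_mels=40, context=11, n_states=3000):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 64, kernel_size=(9, 9), padding=4),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(3, 1)),   # pool along frequency only
            )
            self.fc = nn.Sequential(
                nn.Flatten(),
                nn.Linear(64 * (n_mels // 3) * context, 1024),
                nn.ReLU(),
                nn.Linear(1024, n_states),          # HMM-state posteriors
            )

        def forward(self, x):
            return self.fc(self.conv(x))

    # e.g. posteriors = ConvAcousticModel()(torch.randn(8, 1, 40, 11))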
New types of deep neural network learning for speech recognition and related applications: an overview
TLDR
An overview is provided of the invited and contributed papers presented at the special session at ICASSP-2013, entitled “New Types of Deep Neural Network Learning for Speech Recognition and Related Applications,” which was organized by the authors.
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
  • Brian Kingsbury
  • Computer Science
    IEEE International Conference on Acoustics…
  • 19 April 2009
TLDR
This paper demonstrates that neural-network acoustic models can be trained with sequence classification criteria using exactly the same lattice-based methods that have been developed for Gaussian mixture HMMs, and that using a sequence classification criterion in training leads to considerably better performance.
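A minimal sketch of the central quantity in such lattice-based sequence training, assuming the usual MMI formulation (the function name and array shapes are hypothetical): the error signal fed back into the network is, per frame and state, the difference between occupancies computed by forward-backward over the numerator (reference) lattice and the denominator (competing-hypothesis) lattice.

    import numpy as np

    def mmi_output_error(gamma_num, gamma_den, acoustic_scale=1.0):
        # gamma_num, gamma_den: (T, S) per-frame state occupancies obtained
        # by forward-backward over the numerator and denominator lattices.
        # The gradient of the MMI objective with respect to the network's
        # log state posteriors is, up to the acoustic scale, their difference.
        return acoustic_scale * (np.asarray(gamma_num) - np.asarray(gamma_den))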
fMPE: discriminatively trained features for speech recognition
MPE (minimum phone error) is a previously introduced technique for discriminative training of HMM parameters. fMPE applies the same objective function to the features, transforming the data with a…
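For context, the feature-level transform fMPE trains is usually presented along the following lines (a sketch under the common formulation; the symbols are mine, not from the truncated snippet above): each frame x_t is offset by a learned projection M of a high-dimensional vector h_t of Gaussian posteriors computed at that frame,

    y_t = x_t + M h_t

with M trained by gradient descent on the MPE objective.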
Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets
TLDR
A low-rank matrix factorization of the final weight layer is proposed and applied to DNNs for both acoustic modeling and language modeling, showing an equivalent reduction in training time with no significant loss in final recognition accuracy compared to a full-rank representation.
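A minimal PyTorch sketch of the idea, assuming illustrative sizes (a 1024-dimensional hidden layer, 10,000 output targets, and a rank of 256 are assumptions, not figures from the paper): the single large output matrix is replaced by two smaller factors, cutting parameters, and therefore training time, when the target set is large.

    import torch.nn as nn

    h, v, r = 1024, 10000, 256   # hidden size, number of targets, chosen rank

    # Full-rank output layer: h * v ≈ 10.2M weights.
    full_rank_output = nn.Linear(h, v)

    # Low-rank factorization of the same layer: h * r + r * v ≈ 2.8M weights,
    # i.e. the final weight matrix is constrained to rank r.
    low_rank_output = nn.Sequential(
        nn.Linear(h, r, bias=False),
        nn.Linear(r, v),
    )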
...