Deep Neural Network Language Models
@inproceedings{Arisoy2012DeepNN, title={Deep Neural Network Language Models}, author={Ebru Arisoy and Tara N. Sainath and Brian Kingsbury and Bhuvana Ramabhadran}, booktitle={WLM@NAACL-HLT}, year={2012} }
In recent years, neural network language models (NNLMs) have shown success in both perplexity and word error rate (WER) compared to conventional n-gram language models. […] Furthermore, our preliminary results are competitive with a Model M language model, considered to be one of the current state-of-the-art techniques for language modeling.
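For illustration, a minimal feed-forward deep NNLM of the kind the abstract describes (a projection layer, a stack of hidden layers, and a softmax output over the vocabulary) might be sketched as below. This is a hedged sketch only: the vocabulary size, context length, and layer widths are placeholder assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DeepNNLM(nn.Module):
    """Feed-forward n-gram language model with a stack of hidden layers.

    Illustrative sketch only: vocab_size, context size, and layer widths
    are placeholder values, not the configuration used in the paper.
    """
    def __init__(self, vocab_size=10000, context=3, embed_dim=120,
                 hidden_dim=500, num_hidden=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # projection layer
        layers = []
        in_dim = context * embed_dim                        # concatenated history embeddings
        for _ in range(num_hidden):                         # "deep" part: more than one hidden layer
            layers += [nn.Linear(in_dim, hidden_dim), nn.Tanh()]
            in_dim = hidden_dim
        self.hidden = nn.Sequential(*layers)
        self.out = nn.Linear(hidden_dim, vocab_size)        # softmax output over the full vocabulary

    def forward(self, history):                             # history: (batch, context) word ids
        h = self.embed(history).flatten(1)                  # (batch, context * embed_dim)
        return torch.log_softmax(self.out(self.hidden(h)), dim=-1)

# Toy usage: log-probabilities of the next word given a 3-word history.
model = DeepNNLM()
history = torch.randint(0, 10000, (2, 3))
log_probs = model(history)                                  # shape: (2, vocab_size)
```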
181 Citations
Recurrent Neural Network Language Model with Vector-Space Word Representations
- Computer Science
- 2014
Vector-space word representations (word vectors), which can capture syntactic and semantic regularities of language, are used as additional features to enhance the RNNLM.
Comparison of Various Neural Network Language Models in Speech Recognition
- Computer Science, 2016 3rd International Conference on Information Science and Control Engineering (ICISCE)
- 2016
This paper compares count models to feedforward, recurrent, and LSTM neural networks on conversational telephone speech recognition tasks, and puts forward a language model estimation method that introduces information from history sentences.
Factored Language Model based on Recurrent Neural Network
- Computer Science, COLING
- 2012
This study extends the RNNLM by explicitly integrating additional linguistic information, including morphological, syntactic, or semantic factors, which is expected to enhance RNNLMs.
Scalable Recurrent Neural Network Language Models for Speech Recognition
- Computer Science
- 2015
This thesis further explores recurrent neural networks for automatic speech recognition from the perspective of language modeling and investigates the integration of metadata into RNNLMs for ASR.
From Feedforward to Recurrent LSTM Neural Networks for Language Modeling
- Computer Science, IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2015
This paper compares count models to feedforward, recurrent, and long short-term memory (LSTM) neural network variants on two large-vocabulary speech recognition tasks, and analyzes the potential improvements that can be obtained when applying advanced algorithms to the rescoring of word lattices on large-scale setups.
Bag-of-words input for long history representation in neural network-based language models for speech recognition
- Computer Science, INTERSPEECH
- 2015
This paper investigates an alternative encoding of the word history, known as the bag-of-words (BOW) representation of a word sequence, uses it as an additional input feature to the NNLM, and shows that the BOW features significantly improve both the perplexity (PP) and the word error rate (WER) of a standard FFNN LM.
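To make the idea concrete, here is a small sketch (plain NumPy, with an invented vocabulary size and decay factor, not the paper's setup) of turning an arbitrarily long word history into a fixed-size bag-of-words vector that can be appended to the usual n-gram input features of a feed-forward LM:

```python
import numpy as np

def bow_history(history_ids, vocab_size, decay=0.9):
    """Encode an arbitrarily long word history as one fixed-size vector.

    Each past word increments its vocabulary slot; older words are
    down-weighted exponentially (the decay value here is illustrative).
    """
    bow = np.zeros(vocab_size, dtype=np.float32)
    weight = 1.0
    for word_id in reversed(history_ids):   # most recent word first
        bow[word_id] += weight
        weight *= decay
    return bow

# The resulting vector would be concatenated with the standard
# n-gram embedding input of the feed-forward NNLM.
features = bow_history([4, 17, 4, 93, 2], vocab_size=100)
```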
Recurrent neural networks for language understanding
- Computer Science, INTERSPEECH
- 2013
This paper modifies the architecture to perform language understanding and advances the state of the art on the widely used ATIS dataset.
Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition
- Computer Science, INTERSPEECH
- 2018
It is shown that deep 4-gram LSTMs can significantly outperform large interpolated count models by performing backing-off and smoothing significantly better, which underlines the decreasing importance of combining state-of-the-art deep NNLMs with count-based models.
Comparison of feedforward and recurrent neural network language models
- Computer Science, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2013
A simple and efficient method to normalize language model probabilities across different vocabularies is proposed, and it is shown how to speed up training of recurrent neural networks by parallelization.
Efficient training strategies for deep neural network language models.
- Computer Science, NIPS 2014
- 2014
An extensive study of best practices for training large neural network language models on a corpus of more than 5.5 billion words; an important finding is that deep architectures systematically achieve better translation quality than shallow ones.
References
Recurrent neural network based language model
- Computer Science, INTERSPEECH
- 2010
Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Structured Output Layer neural network language model
- Computer Science, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2011
A new neural network language model (NNLM) that uses word clustering to structure the output vocabulary, the Structured Output Layer NNLM, is able to handle vocabularies of arbitrary size, dispensing with the short-lists commonly used in NNLMs.
Training Neural Network Language Models on Very Large Corpora
- Computer Science, HLT
- 2005
New algorithms to train a neural network language model on very large text corpora are presented, making the approach usable in domains where several hundred million words of text are available.
Hierarchical Probabilistic Neural Network Language Model
- Computer Science, AISTATS
- 2005
A hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, is introduced, yielding a speed-up of about 200 during both training and recognition.
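The speed-up comes from replacing a single softmax over the full vocabulary with a sequence of binary decisions along a path in a word tree, costing roughly O(log V) instead of O(V) per word. A toy sketch of the decomposition (NumPy, with a made-up tree, path, and context vector, purely to show the product of binary decisions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hierarchical_word_prob(context_vec, path, node_vectors):
    """P(word | context) as a product of binary decisions along a tree path.

    `path` lists (node_id, go_left) pairs from the root to the word's leaf;
    each inner node has its own parameter vector. With a balanced tree this
    needs O(log V) node evaluations instead of O(V) output computations.
    """
    prob = 1.0
    for node_id, go_left in path:
        p_left = sigmoid(context_vec @ node_vectors[node_id])
        prob *= p_left if go_left else (1.0 - p_left)
    return prob

# Toy usage: a 3-level path for one word in a tiny invented tree.
rng = np.random.default_rng(0)
context = rng.normal(size=50)
nodes = {0: rng.normal(size=50), 1: rng.normal(size=50), 3: rng.normal(size=50)}
p = hierarchical_word_prob(context, [(0, True), (1, False), (3, True)], nodes)
```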
Deep Belief Networks for phone recognition
- Computer Science
- 2009
Deep Belief Networks (DBNs) have recently proved to be very effective in a variety of machine learning problems, and this paper applies DBNs to acoustic modeling.
Making Deep Belief Networks effective for large vocabulary continuous speech recognition
- Computer Science, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding
- 2011
This paper explores the performance of DBNs in a state-of-the-art LVCSR system, showing improvements over Multi-Layer Perceptrons (MLPs) and GMM/HMMs across a variety of features on an English Broadcast News task.
Strategies for training large scale neural network language models
- Computer Science, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding
- 2011
This work describes how to effectively train neural-network-based language models on large data sets and introduces a hash-based implementation of a maximum entropy model that can be trained as part of the neural network model.
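The hash trick maps the huge space of n-gram features onto a fixed-size weight table, so the maximum entropy component can be trained jointly with the network without storing every n-gram explicitly. A rough sketch (Python, with an invented table size and hashing scheme, not the paper's implementation):

```python
from collections import defaultdict

def hashed_ngram_indices(history_ids, candidate_id, table_size=2**20, max_order=3):
    """Map each (history n-gram, candidate word) feature to a slot in a
    fixed-size weight table; hash collisions are simply tolerated."""
    indices = []
    for order in range(1, max_order + 1):
        feature = tuple(history_ids[-order:]) + (candidate_id,)
        indices.append(hash(feature) % table_size)
    return indices

# During scoring, the direct (max-ent) contribution for a candidate word is
# the sum of its hashed feature weights, added to the neural network's logit.
weights = defaultdict(float)   # learned jointly with the network in practice
logit_bonus = sum(weights[i] for i in hashed_ngram_indices([12, 7, 42], candidate_id=5))
```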
A Scalable Hierarchical Distributed Language Model
- Computer Science, NIPS
- 2008
A fast hierarchical language model, along with a simple feature-based algorithm for the automatic construction of word trees from data, is introduced, and it is shown that the resulting models can outperform non-hierarchical neural models as well as the best n-gram models.
Extensions of recurrent neural network language model
- Computer Science, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2011
Several modifications of the original recurrent neural network language model are presented, showing approaches that lead to a more than 15-fold speedup in both the training and testing phases, as well as ways to reduce the number of parameters in the model.