• Corpus ID: 18939716

LSTM Neural Networks for Language Modeling

@inproceedings{Sundermeyer2012LSTMNN,
  title={LSTM Neural Networks for Language Modeling},
  author={Martin Sundermeyer and Ralf Schl{\"u}ter and Hermann Ney},
  booktitle={INTERSPEECH},
  year={2012}
}
Neural networks have become increasingly popular for the task of language modeling. [...] In addition, we gain considerable improvements in WER on top of a state-of-the-art speech recognition system.
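The paper itself ships no code, but the kind of model it studies can be sketched compactly. Below is a minimal PyTorch sketch of an LSTM language model trained on next-word prediction; the class name, hyperparameters, and dummy batch are illustrative assumptions, not the authors' setup.

# Minimal LSTM language model sketch (PyTorch); sizes are placeholder assumptions.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=200, hidden_dim=200, num_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # word id -> continuous vector
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)      # hidden state -> next-word logits

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) integer word ids
        emb = self.embed(tokens)
        out, state = self.lstm(emb, state)
        return self.proj(out), state                       # (batch, seq_len, vocab_size)

# Next-word prediction: the target sequence is the input shifted by one position.
model = LSTMLanguageModel(vocab_size=10000)
tokens = torch.randint(0, 10000, (4, 20))                  # dummy batch of word ids
logits, _ = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 10000), tokens[:, 1:].reshape(-1))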
Comparison of feedforward and recurrent neural network language models
TLDR
A simple and efficient method to normalize language model probabilities across different vocabularies is proposed, and it is shown how to speed up training of recurrent neural networks by parallelization.
Sequential Recurrent Neural Networks for Language Modeling
TLDR
An architecture is proposed which takes advantage of the explicit, sequential enumeration of the word history in the FNN structure while enhancing each word representation at the projection layer through recurrent context information that evolves in the network.
From Feedforward to Recurrent LSTM Neural Networks for Language Modeling
TLDR
This paper compares count models to feedforward, recurrent, and long short-term memory (LSTM) neural network variants on two large-vocabulary speech recognition tasks, and analyzes the potential improvements that can be obtained when applying advanced algorithms to the rescoring of word lattices on large-scale setups.
Comparison of Various Neural Network Language Models in Speech Recognition
  • Lingyun Zuo, Jian Liu, Xin Wan
  • Computer Science
    2016 3rd International Conference on Information Science and Control Engineering (ICISCE)
  • 2016
TLDR
This paper compares count models to feedforward, recurrent, and LSTM neural networks on conversational telephone speech recognition tasks, and puts forward a language model estimation method that incorporates information from history sentences.
An improved recurrent neural network language model with context vector features
  • J. Zhang, Dan Qu, Zhen Li
  • Computer Science
    2014 IEEE 5th International Conference on Software Engineering and Service Science
  • 2014
TLDR
This paper improves model performance by providing contextual word vectors in association with RNNLMs, which reinforces the ability to learn long-distance information using vectors trained with the Skip-gram model.
An Algorithm for Text Prediction Using Neural Networks
TLDR
An algorithm using machine learning is proposed which, when given a dataset of conversations, trains a neural network that can then be used to suggest replies for any particular input sentence.
Application of LSTM Neural Networks in Language Modelling
TLDR
Recurrent neural networks can take advantage of all previous words to predict the next word; because of the difficulties in training RNNs, one way forward is the Long Short-Term Memory (LSTM) neural network architecture.
Temporal kernel neural network language model
TLDR
This paper proposes the temporal kernel neural network language model, a variant of the models mentioned above that explicitly captures long-term dependencies of words with an exponential kernel, where the memory of the history decays exponentially.
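As a rough illustration of the exponential-kernel idea (a sketch of the general mechanism, not this paper's exact model), the word history can be summarized as a leaky sum of past word vectors; the decay factor and dimensions below are assumptions.

# Sketch of an exponentially decaying memory of the word history.
import numpy as np

def decayed_history(embeddings, decay=0.9):
    # embeddings: (T, d) word vectors for the history w_1 .. w_T
    m = np.zeros(embeddings.shape[1])
    for x in embeddings:
        m = decay * m + x          # equals sum_k decay**k * x_{T-k}: older words fade exponentially
    return m

history = np.random.randn(50, 100)      # 50 history words, 100-dim vectors (made up)
print(decayed_history(history).shape)   # (100,)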
Recurrent neural network language model with structured word embeddings for speech recognition
TLDR
Experiments showed that the proposed approach effectively outperformed the standard RNNLM, yielding a relative PPL improvement of 8.8% and an absolute 0.59% CER improvement in N-best rescoring.
On Continuous Space Word Representations as Input of LSTM Language Model
TLDR
The continuous skip-gram and the continuous bag-of-words (CBOW) are algorithms for learning quality distributed vector representations that are able to capture a large number of syntactic and semantic word relationships; both are used as the input of the LSTM network.
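A hedged sketch of that recipe in PyTorch: pretrained word vectors (skip-gram or CBOW) initialize the input layer of an LSTM language model. The random matrix below merely stands in for real word2vec vectors, and all sizes are assumptions.

# Initialize the LSTM LM input layer from pretrained word vectors.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 10000, 300, 512
pretrained = torch.randn(vocab_size, embed_dim)                 # stand-in for skip-gram/CBOW vectors

embed = nn.Embedding.from_pretrained(pretrained, freeze=True)   # keep the pretrained vectors fixed
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
proj = nn.Linear(hidden_dim, vocab_size)

tokens = torch.randint(0, vocab_size, (2, 15))
out, _ = lstm(embed(tokens))
next_word_logits = proj(out)                                    # (2, 15, vocab_size)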

References

Showing 1-10 of 19 references
Recurrent neural network based language model
TLDR
Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
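As a generic illustration of mixing language models (not this paper's exact setup), linear interpolation of per-word probabilities and the resulting perplexity can be computed as below; all probabilities and the interpolation weight are made-up numbers.

# Interpolate two language models and compare perplexities (toy numbers).
import math

def perplexity(word_probs):
    # word_probs: per-word probabilities a model assigns to a test text
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))

rnn_lm  = [0.10, 0.20, 0.05, 0.30]   # hypothetical per-word probabilities
backoff = [0.08, 0.05, 0.10, 0.10]

lam = 0.5                            # interpolation weight (assumed)
mixture = [lam * p + (1 - lam) * q for p, q in zip(rnn_lm, backoff)]

print(perplexity(rnn_lm), perplexity(backoff), perplexity(mixture))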
Performance analysis of Neural Networks in combination with n-gram language models
TLDR
It is explained which types of events are better modeled by NNLMs as compared to n-gram LMs, in which cases the improvements are most substantial, and why this is the case.
Hierarchical Probabilistic Neural Network Language Model
TLDR
A hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, is introduced, yielding a speed-up of about 200 during both training and recognition.
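The paper's decomposition follows the WordNet hierarchy; the sketch below shows only the simpler two-level class factorization P(w|h) = P(class(w)|h) * P(w|class(w),h), as an illustration of why a hierarchical output layer is cheaper than one flat softmax over the full vocabulary. Class assignments and sizes are arbitrary assumptions.

# Two-level factorization of the output distribution (simplified stand-in
# for the WordNet-based hierarchy).
import torch
import torch.nn as nn

hidden_dim, num_classes, words_per_class = 256, 100, 100   # 10k-word vocabulary, assumed split

class_head = nn.Linear(hidden_dim, num_classes)
word_heads = nn.ModuleList(nn.Linear(hidden_dim, words_per_class)
                           for _ in range(num_classes))

def log_prob(h, word_class, word_in_class):
    # h: (hidden_dim,) hidden state summarizing the history
    log_p_class = torch.log_softmax(class_head(h), dim=-1)[word_class]
    log_p_word = torch.log_softmax(word_heads[word_class](h), dim=-1)[word_in_class]
    return log_p_class + log_p_word   # two small softmaxes instead of one over 10k words

h = torch.randn(hidden_dim)
print(log_prob(h, word_class=3, word_in_class=42))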
Extensions of recurrent neural network language model
TLDR
Several modifications of the original recurrent neural network language model are presented, showing approaches that lead to more than a 15-fold speedup for both the training and testing phases, as well as ways to reduce the number of parameters in the model.
Training Continuous Space Language Models: Some Practical Issues
TLDR
This work studies the performance and behavior of two neural statistical language models so as to highlight some important caveats of the classical training algorithms, and introduces a new initialization scheme and new training techniques to greatly reduce the training time and to significantly improve performance.
Continuous space language models
TLDR
Highly efficient learning algorithms are described that enable the use of training corpora of several hundred million words and it is shown that this approach can be incorporated into a large vocabulary continuous speech recognizer using a lattice rescoring framework at a very low additional processing time.
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
TLDR
The main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs).
Learning to Forget: Continual Prediction with LSTM
TLDR
This work identifies a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset, and proposes a novel, adaptive forget gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
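A minimal numpy sketch of one LSTM step including such a forget gate; the gate ordering, weight shapes, and random initialization are assumptions for illustration only.

# One step of an LSTM cell with a forget gate. The gate f decides how much of
# the previous cell state to keep, which is what lets the cell "reset" itself
# on continual input streams. All weights are random placeholders.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W: (4H, X), U: (4H, H), b: (4H,), stacked for the i, f, o, g gates
    z = W @ x + U @ h_prev + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate: 0 = discard old cell state, 1 = keep it
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

X, H = 10, 8
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4*H, X)), rng.normal(size=(4*H, H)), np.zeros(4*H)
h, c = lstm_step(rng.normal(size=X), np.zeros(H), np.zeros(H), W, U, b)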
RNNLM - Recurrent Neural Network Language Modeling Toolkit
We present a freely available open-source toolkit for training recurrent neural network based language models. It can be easily used to improve existing speech recognition and machine translation systems.
Learning long-term dependencies with gradient descent is difficult
TLDR
This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.
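A tiny numerical illustration of that effect (the recurrent matrix below, with spectral radius 0.9, is an arbitrary assumption): a gradient propagated back through more and more time steps shrinks geometrically, which is the vanishing-gradient problem LSTM was designed to mitigate.

# Vanishing gradients: products of Jacobians shrink as the dependency gets longer.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
W = 0.9 * Q                                            # recurrent matrix, spectral radius 0.9 (assumed)

for t in (10, 50, 100):
    g = np.linalg.matrix_power(W.T, t) @ np.ones(16)   # gradient pushed back t steps
    print(t, np.linalg.norm(g))                        # decays roughly like 0.9**t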