Rapid Bayesian learning for recurrent neural network language model

Abstract

This paper presents Bayesian learning for the recurrent neural network language model (RNN-LM). Our goal is to regularize the RNN-LM by compensating for the randomness of the estimated model parameters, which is characterized by a Gaussian prior. The model is not only constructed by training the synaptic weight parameters according to the maximum a posteriori criterion but also regularized by estimating the Gaussian hyper-parameter through type-2 maximum likelihood. However, a critical issue in Bayesian RNN-LM is the heavy computation of the Hessian matrix, which is formed as the sum of a large number of outer products of high-dimensional gradient vectors. We present a rapid approximation that reduces the redundancy due to the curse of dimensionality and speeds up the calculation by summing only the salient outer products. Experiments on the 1B-Word Benchmark, Penn Treebank and Wall Street Journal corpora show that the rapid Bayesian RNN-LM consistently improves perplexity and word error rate in comparison with the standard RNN-LM.
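
The outer-product Hessian approximation described in the abstract can be sketched numerically. The snippet below is a minimal NumPy illustration, not the authors' implementation: the Hessian is approximated by summing outer products of per-sample gradient vectors, and the "rapid" variant sums only a subset of salient outer products. The saliency criterion (largest gradient norms), the function names, and the random test data are illustrative assumptions made here for clarity.

```python
import numpy as np

def full_outer_product_hessian(grads):
    """Approximate the Hessian as the sum of outer products g_t g_t^T
    over all per-sample gradient vectors g_t."""
    d = grads.shape[1]
    H = np.zeros((d, d))
    for g in grads:
        H += np.outer(g, g)
    return H

def rapid_outer_product_hessian(grads, top_k):
    """Rapid variant: sum only 'salient' outer products. Saliency is
    assumed here to mean the top_k gradients with the largest norms."""
    d = grads.shape[1]
    norms = np.linalg.norm(grads, axis=1)
    salient = np.argsort(norms)[-top_k:]   # indices of largest-norm gradients
    H = np.zeros((d, d))
    for g in grads[salient]:
        H += np.outer(g, g)
    return H

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 1000, 50                        # number of gradients, parameter dimension
    grads = rng.normal(size=(T, d))
    H_full = full_outer_product_hessian(grads)
    H_fast = rapid_outer_product_hessian(grads, top_k=100)
    rel_err = np.linalg.norm(H_full - H_fast) / np.linalg.norm(H_full)
    print(f"relative Frobenius error of rapid approximation: {rel_err:.3f}")
```

Under these assumptions, forming the matrix costs O(k d^2) for k salient gradients instead of O(T d^2) for all T gradients, which is where the speed-up comes from when k is much smaller than T.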

DOI: 10.1109/ISCSLP.2014.6936640

Cite this paper

@article{Chien2014RapidBL,
  title   = {Rapid Bayesian learning for recurrent neural network language model},
  author  = {Jen-Tzung Chien and Yuan-Chu Ku and Mou-Yue Huang},
  journal = {The 9th International Symposium on Chinese Spoken Language Processing},
  year    = {2014},
  pages   = {34-38}
}