On the difficulty of training recurrent neural networks

@inproceedings{Pascanu2013OnTD,
  title={On the difficulty of training recurrent neural networks},
  author={Razvan Pascanu and Tomas Mikolov and Yoshua Bengio},
  booktitle={ICML},
  year={2013}
}

There are two widely known issues with properly training Recurrent Neural Networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994). In this paper we attempt to improve the understanding of the underlying issues by exploring these problems from an analytical, a geometric and a dynamical systems perspective. Our analysis is used to justify a simple yet effective solution. We propose a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem.
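
The gradient norm clipping strategy mentioned in the abstract rescales the gradient whenever its norm exceeds a chosen threshold. The snippet below is a minimal NumPy sketch of that idea, not the paper's implementation; the function name and threshold value are illustrative assumptions.

import numpy as np

def clip_gradient_norm(grad, threshold=1.0):
    # Rescale grad so its L2 norm does not exceed threshold:
    # if ||g|| > threshold, return g * (threshold / ||g||), else return g unchanged.
    # The default threshold is an illustrative choice, not taken from the paper.
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

# Example: an "exploded" gradient of norm 50 is rescaled down to norm 5.
g = np.array([30.0, -40.0])
print(np.linalg.norm(clip_gradient_norm(g, threshold=5.0)))  # approximately 5.0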
This paper has highly influenced 129 other papers and has 1,841 citations.

7 Figures & Tables

Statistics

Citations per Year (2013-2019)

1,842 Citations

Semantic Scholar estimates that this publication has 1,842 citations based on the available data.
