Long Short-Term Memory

Sepp Hochreiter and Jürgen Schmidhuber
Neural Computation
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant…
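
The "decaying error backflow" mentioned in the abstract can be illustrated with a small numeric sketch. This is not the paper's method or code; it assumes a hypothetical single unit with one self-connection, where each backward step scales the error by |weight| times the activation slope. The function name, the parameter names, and the default slope of 0.25 (the maximum derivative of the logistic sigmoid) are choices made for this illustration only.

```python
def backflow_magnitudes(recurrent_weight, steps, activation_slope=0.25):
    """Error magnitude after 0..steps backward steps through one self-connected unit.

    Illustrative assumption: every backward step multiplies the error by
    |recurrent_weight| * activation_slope.
    """
    factor = abs(recurrent_weight) * activation_slope
    magnitudes = [1.0]  # error magnitude at the step where it is injected
    for _ in range(steps):
        magnitudes.append(magnitudes[-1] * factor)
    return magnitudes


# Per-step factor 0.5: the error shrinks exponentially with the time lag.
decaying = backflow_magnitudes(recurrent_weight=2.0, steps=50)

# Per-step factor exactly 1.0: the magnitude is unchanged at every step.
preserved = backflow_magnitudes(recurrent_weight=4.0, steps=50)

print(decaying[-1], preserved[-1])
```

With a per-step factor below 1, the signal reaching a step 50 lags back is many orders of magnitude smaller than the injected error, which is why learning across long time lags is slow under these assumptions.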
This paper has highly influenced 2,469 other papers.
This paper has 20,109 citations.


Publications citing this paper.

RF Sensing in the Internet of Things: A General Deep Learning Framework

IEEE Communications Magazine • 2018

Towards End-to-end Spoken Language Understanding

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) • 2018

Neural Associative Memory for Dual-Sequence Modeling

Rep4NLP@ACL • 2016

Structured Label Inference for Visual Understanding

IEEE Transactions on Pattern Analysis and Machine Intelligence • 2019

4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition • 2018



Publications referenced by this paper.

Guessing can outperform many long time lag algorithms

J. Schmidhuber, S. Hochreiter
Tech. Rep. No. IDSIA-19-96. Lugano, Switzerland: Istituto Dalle Molle di Studi sull’Intelligenza Artificiale • 1996

Induction of Multiscale Temporal Structure

NIPS • 1991

Untersuchungen zu dynamischen neuronalen Netzen

J. Hochreiter
Diploma thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Universität München. See http://www7.informatik.tu-muenchen.de/~hochreit • 1991