Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies

@inproceedings{Kolen2001GradientFI,
  title={Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies},
  author={J. Kolen and S. C. Kremer},
  year={2001}
}
This chapter contains sections titled: Introduction; Exponential Error Decay; Dilemma: Avoiding Gradient Decay Prevents Long-Term Latching; Remedies; Conclusion.
164 Citations

Attempting to reduce the vanishing gradient effect through a novel recurrent multiscale architecture
  • 7
Low-Cost Recurrent Neural Network Expected Performance Evaluation
  • 13
Time-series forecasting with deep learning: a survey
  • Bryan Lim, S. Zohren
  • Mathematics, Computer Science
  • Philosophical Transactions of the Royal Society A
  • 2021
  • 16
A review on the long short-term memory model
  • 6
Recurrent collective classification
  • 8
Extract, Attend, Predict: Aspect-Based Sentiment Analysis with Deep Self-Attention Network
  • Yiwei Lv, Minghao Hu, C. Yang, Yuanyan Tang, H. Wang
  • Computer Science
  • 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
  • 2019
  • 2
Spectral Analysis and Stability of Deep Neural Dynamics
