Learning to Forget: Continual Prediction with LSTM

@article{Gers2000LearningTF,
  title={Learning to Forget: Continual Prediction with LSTM},
  author={F. Gers and J. Schmidhuber and F. Cummins},
  journal={Neural Computation},
  year={2000},
  volume={12},
  pages={2451-2471}
}
Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
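
The remedy can be illustrated with a short sketch. The NumPy code below is not the authors' implementation; the weight names (W, b), the gate layout, and the sizes in the usage example are assumptions, included only to show how a learned forget gate scales the previous cell state so it no longer accumulates without bound on a continual input stream.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One time step of an LSTM cell with input, forget, and output gates.

    x: input vector; h_prev, c_prev: previous hidden and cell state;
    W, b: dicts of weight matrices and bias vectors (assumed given).
    """
    z = np.concatenate([x, h_prev])      # combined input to all gates
    i = sigmoid(W["i"] @ z + b["i"])     # input gate
    f = sigmoid(W["f"] @ z + b["f"])     # forget gate: learns when to reset
    o = sigmoid(W["o"] @ z + b["o"])     # output gate
    g = np.tanh(W["g"] @ z + b["g"])     # candidate cell update
    c = f * c_prev + i * g               # forget gate scales the old state,
                                         # so c need not grow indefinitely
    h = o * np.tanh(c)                   # new hidden state
    return h, c

# Illustrative usage with random weights (input size 3, hidden size 4).
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.standard_normal((n_hid, n_in + n_hid)) for k in "ifog"}
b = {k: np.zeros(n_hid) for k in "ifog"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):   # a short continual input stream
    h, c = lstm_step(x, h, c, W, b)

The key line is c = f * c_prev + i * g: when the learned forget activation f drops toward zero, the cell effectively resets its own state, which is what the paper proposes in place of externally marked sequence ends.
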
Citations

Learning Precise Timing with LSTM Recurrent Networks
A generalized LSTM-like training algorithm for second-order recurrent neural networks
Learning compact recurrent neural networks
A review on the long short-term memory model
Gated Orthogonal Recurrent Units: On Learning to Forget
Training Recurrent Networks by Evolino
Radically Simplifying Gated Recurrent Architectures Without Loss of Performance
  J. Boardman and Y. Xie. 2019 IEEE International Conference on Big Data (Big Data), 2019.
The Performance of LSTM and BiLSTM in Forecasting Time Series

References

Showing 1-10 of 60 references
Long Short-Term Memory
LSTM recurrent networks learn simple context-free and context-sensitive languages
Encoding sequential structure: experience with the real-time recurrent learning algorithm
Learning long-term dependencies with gradient descent is difficult
Learning long-term dependencies in NARX recurrent neural networks
Gradient calculations for dynamic recurrent neural networks: a survey
The Recurrent Cascade-Correlation Architecture
Finite State Automata and Simple Recurrent Networks