Learning to Forget: Continual Prediction with LSTM

@article{Gers2000LearningTF,
  title={Learning to Forget: Continual Prediction with LSTM},
  author={Felix A. Gers and J{\"u}rgen Schmidhuber and Fred A. Cummins},
  journal={Neural Computation},
  year={2000},
  volume={12},
  pages={2451-2471}
}
Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
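
The remedy described in the abstract can be made concrete with a minimal sketch of one LSTM step in which a forget gate multiplicatively rescales the previous cell state. This is an illustrative NumPy sketch in standard modern notation, not the authors' code; the parameter layout, names, and sizes are assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold stacked parameters for the four blocks:
    # input gate i, forget gate f, cell candidate g, output gate o.
    z = W @ x + U @ h_prev + b          # pre-activations, shape (4 * hidden,)
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    # Without the forget gate (i.e. f fixed at 1), the cell state c can grow
    # without bound on continual input streams; learning f lets the cell
    # reset its own state at appropriate times.
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

# Toy usage with random parameters (illustrative only).
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)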
