Learning to Forget: Continual Prediction with LSTM

@article{Gers2000LearningTF,
  title={Learning to Forget: Continual Prediction with LSTM},
  author={Felix A. Gers and J{\"u}rgen Schmidhuber and Fred A. Cummins},
  journal={Neural Computation},
  year={2000},
  volume={12},
  pages={2451-2471}
}
    Abstract

    Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
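
    As a rough illustration of the mechanism the abstract describes, here is a minimal NumPy sketch of one LSTM cell step with a forget gate. It is not the paper's exact formulation: the weight names, sizes, and initialization below are illustrative assumptions, and the paper's original architecture gates individual memory blocks rather than a single flat state vector.

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def lstm_step(x, h_prev, c_prev, params):
            """One step of an LSTM cell with a forget gate (illustrative sketch).

            The forget gate f scales the previous cell state, so the cell can
            learn to reset itself instead of accumulating state indefinitely.
            """
            Wf, Wi, Wg, Wo, bf, bi, bg, bo = params
            z = np.concatenate([x, h_prev])   # current input joined with previous hidden state
            f = sigmoid(Wf @ z + bf)          # forget gate: near 0 resets the state, near 1 keeps it
            i = sigmoid(Wi @ z + bi)          # input gate
            g = np.tanh(Wg @ z + bg)          # candidate cell update
            o = sigmoid(Wo @ z + bo)          # output gate
            c = f * c_prev + i * g            # with f fixed at 1 this is the original LSTM,
                                              # whose state can drift without bound on continual input
            h = o * np.tanh(c)
            return h, c

        # Hypothetical sizes and random weights, just to run the cell on a
        # continual stream with no segment boundaries or external resets.
        rng = np.random.default_rng(0)
        n_in, n_hid = 3, 4
        params = [rng.normal(0.0, 0.1, (n_hid, n_in + n_hid)) for _ in range(4)] \
               + [np.zeros(n_hid) for _ in range(4)]
        h, c = np.zeros(n_hid), np.zeros(n_hid)
        for t in range(10000):
            x = rng.normal(size=n_in)
            h, c = lstm_step(x, h, c, params)

    A common follow-on practice (not from this paper) is to initialize the forget-gate bias to a positive value, so the cell starts out remembering and must learn when to forget.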
    Citations

    Learning Precise Timing with LSTM Recurrent Networks (cited 934 times)
    A generalized LSTM-like training algorithm for second-order recurrent neural networks (cited 52 times)
    A review on the long short-term memory model
    Learning compact recurrent neural networks (cited 63 times)
    Gated Orthogonal Recurrent Units: On Learning to Forget (cited 55 times)
    Training Recurrent Networks by Evolino (cited 223 times)
    Radically Simplifying Gated Recurrent Architectures Without Loss of Performance
    Language Modeling through Long-Term Memory Network (cited 5 times)

    References

    Publications referenced by this paper (57 in total; partial list):
    Learning to Forget: Continual Prediction with LSTM (cited 11 times)
    Long Short-Term Memory (cited 29517 times)
    Encoding sequential structure: experience with the real-time recurrent learning algorithm (cited 40 times)
    Gradient calculations for dynamic recurrent neural networks: a survey (cited 533 times)
    The Recurrent Cascade-Correlation Architecture (cited 163 times)
    Finite State Automata and Simple Recurrent Networks (cited 404 times)