Corpus ID: 1101453

Recurrent Highway Networks

@article{Zilly2017RecurrentHN,
  title={Recurrent Highway Networks},
  author={Julian G. Zilly and Rupesh Kumar Srivastava and Jan Koutn{\'i}k and J{\"u}rgen Schmidhuber},
  journal={ArXiv},
  year={2017},
  volume={abs/1607.03474}
}
  • Julian G. Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber
  • Published 2017
  • Computer Science, Mathematics
  • ArXiv
  • Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with 'deep' transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose…
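
The analysis mentioned in the abstract rests on Geršgorin's circle theorem. For reference, the standard statement (a general linear-algebra fact, not taken from the truncated abstract) localizes every eigenvalue of a matrix A = (a_ij) in C^(n×n) within discs centred on its diagonal entries:

\[
\operatorname{spec}(A) \;\subseteq\; \bigcup_{i=1}^{n} \left\{ \lambda \in \mathbb{C} \;:\; |\lambda - a_{ii}| \le \sum_{j \neq i} |a_{ij}| \right\}
\]

Applied to a recurrent transition matrix, the theorem bounds the eigenvalues that govern how gradients grow or shrink across time steps, which is plausibly how it illuminates the modeling and optimization issues the abstract alludes to.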
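To make the proposed architecture concrete, here is a minimal single-time-step sketch of a Recurrent Highway Network cell in NumPy. This is a sketch under stated assumptions, not the authors' implementation: it uses the coupled-gate variant (carry gate c = 1 − t), and the class name RHNCell, the weight initialization, and the negative transform-gate bias value are illustrative choices.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class RHNCell:
        """Single-time-step Recurrent Highway Network cell (illustrative sketch)."""

        def __init__(self, input_size, hidden_size, depth, rng=None):
            rng = rng or np.random.default_rng(0)
            self.depth = depth
            # The input is projected only into the first micro-layer of each time step.
            self.W_h = rng.normal(0.0, 0.1, (hidden_size, input_size))
            self.W_t = rng.normal(0.0, 0.1, (hidden_size, input_size))
            # Each of the `depth` micro-layers has its own recurrent weights and biases.
            self.R_h = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
            self.R_t = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
            self.b_h = [np.zeros(hidden_size) for _ in range(depth)]
            # A negative transform-gate bias nudges the cell toward carrying state
            # through unchanged early in training (an assumed constant, not a paper detail).
            self.b_t = [np.full(hidden_size, -2.0) for _ in range(depth)]

        def step(self, x, s):
            """Apply `depth` stacked highway micro-layers to the state s for one input x."""
            for l in range(self.depth):
                h_in = self.R_h[l] @ s + self.b_h[l]
                t_in = self.R_t[l] @ s + self.b_t[l]
                if l == 0:
                    h_in += self.W_h @ x
                    t_in += self.W_t @ x
                h = np.tanh(h_in)          # candidate update
                t = sigmoid(t_in)          # transform gate
                s = h * t + s * (1.0 - t)  # coupled carry gate: c = 1 - t
            return s

Increasing `depth` deepens the per-step transition function without adding time steps, which is the architecture's central idea. A tiny usage example:

    cell = RHNCell(input_size=8, hidden_size=16, depth=5)
    s = np.zeros(16)
    for x in np.random.default_rng(1).normal(size=(10, 8)):
        s = cell.step(x, s)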
    312 Citations

    • Fast-Slow Recurrent Neural Networks (54 citations; highly influenced)
    • Recurrent Highway Networks With Grouped Auxiliary Memory (5 citations; highly influenced)
    • Residual Recurrent Highway Networks for Learning Deep Sequence Prediction Models (8 citations; highly influenced)
    • Neural Machine Translation with Recurrent Highway Networks (1 citation)
    • Character-Level Language Modeling with Recurrent Highway Hypernetworks (4 citations)
    • Highway State Gating for Recurrent Highway Networks: Improving Information Flow Through Time
    • From Nodes to Networks: Evolving Recurrent Neural Networks (35 citations)
    • Highway-LSTM and Recurrent Highway Networks for Speech Recognition (21 citations; highly influenced)
    • Regularizing and Optimizing LSTM Language Models (651 citations; highly influenced)
