Memory-Efficient Backpropagation Through Time
@inproceedings{Gruslys2016MemoryEfficientBT, title={Memory-Efficient Backpropagation Through Time}, author={A. Gruslys and R. Munos and Ivo Danihelka and Marc Lanctot and A. Graves}, booktitle={NIPS}, year={2016} }
We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance a trade-off between caching of intermediate results and recomputation. The algorithm is capable of tightly fitting within almost any user-set memory budget while finding an optimal execution policy minimizing the computational cost. Computational devices have limited memory capacity and…
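The caching-versus-recomputation trade-off described in the abstract can be illustrated with a small dynamic program. The sketch below is a simplified cost model and not the paper's exact formulation. It assumes that only hidden states are cached, that cost is counted in forward-step evaluations, and that every backward step needs one fresh forward evaluation from the preceding stored state. The function name `min_forward_steps` and the parameters `t` (sequence length) and `m` (number of memory slots) are hypothetical names chosen for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_forward_steps(t: int, m: int) -> int:
    """Minimum forward-step evaluations to backpropagate through t steps
    using m hidden-state slots, under the simplified model described above.
    """
    if t < 1 or m < 1:
        raise ValueError("t and m must both be at least 1")
    if t == 1:
        # A single step needs one forward evaluation before its backward pass.
        return 1
    if m == 1:
        # No room to cache: re-run from the start for every backward step,
        # costing t + (t - 1) + ... + 1 forward evaluations.
        return t * (t + 1) // 2
    # Advance y steps (cost y), cache that state in one slot, solve the
    # remaining t - y steps with m - 1 slots, then free the slot and solve
    # the first y steps with all m slots.
    return min(
        y + min_forward_steps(t - y, m - 1) + min_forward_steps(y, m)
        for y in range(1, t)
    )


if __name__ == "__main__":
    # Show how the cost falls as the memory budget grows for a fixed length.
    for slots in (1, 2, 4, 8):
        print(slots, min_forward_steps(50, slots))
```

Under these assumptions the cost never increases as `m` grows, which is the behaviour a user-set memory budget relies on: a larger budget buys fewer recomputations.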
Supplemental Code
Make huge neural nets fit in memory
100 Citations
Memory-Efficient Backpropagation for Recurrent Neural Networks
- Computer Science
- Canadian Conference on AI
- 2019
Backpropagation for long sequences: beyond memory constraints with constant overheads
- Computer Science
- ArXiv
- 2018
- 2
- PDF
A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation
- Computer Science, Mathematics
- NeurIPS
- 2019
- 8
- PDF
A Practical Sparse Approximation for Real Time Recurrent Learning
- Computer Science, Mathematics
- ArXiv
- 2020
- 3
- PDF
Backprop with Approximate Activations for Memory-efficient Network Training
- Computer Science, Mathematics
- NeurIPS
- 2019
- 4
- PDF
Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery
- Computer Science, Mathematics
- ArXiv
- 2018
- 2
- PDF
Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training
- Computer Science
- Euro-Par
- 2020
- 1
- PDF
Adaptively Truncating Backpropagation Through Time to Control Gradient Bias
- Computer Science, Mathematics
- UAI
- 2019
- 4
- PDF
Optimal memory-aware backpropagation of deep join networks
- Computer Science, Medicine
- Philosophical Transactions of the Royal Society A
- 2020
- 6
- PDF