Corpus ID: 129945164

Low-Memory Neural Network Training: A Technical Report

@article{Sohoni2019LowMemoryNN,
  title={Low-Memory Neural Network Training: A Technical Report},
  author={Nimit Sharad Sohoni and Christopher Richard Aberger and Megan Leszczynski and Jian Zhang and Christopher R{\'e}},
  journal={ArXiv},
  year={2019},
  volume={abs/1904.10631}
}
  • Nimit Sharad Sohoni, Christopher Richard Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré
  • Published 2019
  • Computer Science, Mathematics
  • ArXiv
  • Memory is increasingly often the bottleneck when training neural network models. Despite this, techniques to lower the overall memory requirements of training have been less widely studied compared to the extensive literature on reducing the memory requirements of inference. In this paper we study a fundamental question: How much memory is actually needed to train a neural network? To answer this question, we profile the overall memory usage of training on two representative deep learning…
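
The profiling question posed in the abstract (how much memory a training step actually uses) can be sketched concretely. The snippet below is not the paper's profiling code; it is a minimal sketch, assuming PyTorch on a CUDA device, with a placeholder model, batch size, and optimizer chosen purely for illustration. It records the peak GPU memory allocated across one forward/backward/update step.

import torch
import torch.nn as nn

# Hypothetical model and batch, chosen only for illustration.
device = "cuda"
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(256, 1024, device=device)
targets = torch.randint(0, 10, (256,), device=device)

# Reset the allocator's peak counter, then run one training step.
# Saved activations, gradients, and optimizer state all contribute to the peak.
torch.cuda.reset_peak_memory_stats(device)
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

peak_mib = torch.cuda.max_memory_allocated(device) / 2**20
print(f"Peak GPU memory during the step: {peak_mib:.1f} MiB")

A single peak counter like this only gives the total; a full profile would further break the number down (for example, weights vs. activations vs. optimizer state).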

    Citations

    Publications citing this paper (14 in total).

    Reformer: The Efficient Transformer

    FICIENT TRAINING OF DEEP NETWORKS

    On the Downstream Performance of Compressed Word Embeddings

    Semantics of the Unwritten


    References

    Publications referenced by this paper (91 in total).

    Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

    Gist: Efficient Data Encoding for Deep Neural Network Training

    Mixed Precision Training (highly influential)

    In-place Activated BatchNorm for Memory-Optimized Training of DNNs
