Corpus ID: 53718921

Image Classification at Supercomputer Scale

@article{Ying2018ImageCA,
  title={Image Classification at Supercomputer Scale},
  author={Chris Ying and Sameer Kumar and Dehao Chen and Tao Wang and Youlong Cheng},
  journal={ArXiv},
  year={2018},
  volume={abs/1811.06992}
}
  • Chris Ying, Sameer Kumar, Dehao Chen, Tao Wang, Youlong Cheng
  • Published 2018
  • Mathematics, Computer Science
  • ArXiv
  • Deep learning is extremely computationally intensive, and hardware vendors have responded by building faster accelerators in large clusters. Training deep learning models at petaFLOPS scale requires overcoming both algorithmic and systems software challenges. In this paper, we discuss three systems-related optimizations: (1) distributed batch normalization to control per-replica batch sizes, (2) input pipeline optimizations to sustain model throughput, and (3) 2-D torus all-reduce to speed up…
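
The first optimization named in the abstract, distributed batch normalization, can be illustrated with a small sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function name `distributed_batch_norm`, the `group_size` parameter, the use of scalar activations, and the plain Python sums that stand in for a cross-replica all-reduce. The idea shown is only the general one: replicas in a group pool their counts, sums, and sums of squares, so the normalization statistics come from a larger effective batch than any single replica holds.

```python
import math


def distributed_batch_norm(replica_batches, group_size, eps=1e-5):
    """Normalize scalar activations with statistics pooled inside replica groups.

    replica_batches: list of per-replica lists of floats (one small batch each).
    group_size: how many consecutive replicas share statistics (hypothetical name).
    """
    if len(replica_batches) % group_size != 0:
        raise ValueError("number of replicas must be divisible by group_size")

    normalized = []
    for start in range(0, len(replica_batches), group_size):
        group = replica_batches[start:start + group_size]
        # Each replica contributes a local count, sum, and sum of squares.
        # Adding them here stands in for an all-reduce across the group.
        count = sum(len(batch) for batch in group)
        total = sum(sum(batch) for batch in group)
        total_sq = sum(sum(x * x for x in batch) for batch in group)
        mean = total / count
        # Clamp at zero to guard against tiny negative values from rounding.
        var = max(total_sq / count - mean * mean, 0.0)
        inv_std = 1.0 / math.sqrt(var + eps)
        # Every replica in the group normalizes with the same shared statistics.
        for batch in group:
            normalized.append([(x - mean) * inv_std for x in batch])
    return normalized


# Four replicas holding two examples each (made-up numbers).
replicas = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
per_replica = distributed_batch_norm(replicas, group_size=1)
grouped = distributed_batch_norm(replicas, group_size=2)
```

With `group_size=1` each replica normalizes over its own two examples; with `group_size=2` pairs of replicas normalize over four examples, which is the sense in which grouping decouples the batch-normalization batch size from the per-replica batch size in this sketch.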

    Citations

    Publications citing this paper (59 citations in total).

    Auto-Precision Scaling for Distributed Deep Learning

    Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks

    Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models

    CITATION STATISTICS

    • 9 Highly Influenced Citations

    • Averaged 20 Citations per year from 2018 through 2020
