CrossBow: Scaling Deep Learning on Multi-GPU Servers

@inproceedings{Koliousis2018CrossBowSD,
  title={CrossBow: Scaling Deep Learning on Multi-GPU Servers},
  author={Alexandros Koliousis and Pijika Watcharapichat and Matthias Weidlich and Paolo Costa and Peter R. Pietzuch},
  year={2018}
}
With the widespread availability of servers with 4 or more GPUs, scalability in the number of GPUs per server becomes a paramount concern when training deep learning models. Systems such as TensorFlow and MXNet train using synchronous stochastic gradient descent: an input batch is partitioned across the GPUs, each of which computes a partial gradient. The gradients are then combined to update the model parameters before proceeding to the next batch. For many deep learning models, this…
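To make the synchronous scheme described above concrete, here is a minimal NumPy sketch (not taken from the paper) of one data-parallel SGD step: the batch is split into per-GPU shards, each shard yields a partial gradient, and the averaged gradient updates the shared parameters. The toy least-squares objective, shard counts, and the function name synchronous_sgd_step are illustrative assumptions.

```python
import numpy as np

def synchronous_sgd_step(w, X, y, num_gpus, lr=0.1):
    """One synchronous SGD step for a toy linear least-squares model.

    The batch (X, y) is partitioned across `num_gpus` simulated workers;
    each computes a partial gradient on its shard, and the averaged
    gradient updates the shared parameters w, mirroring the scheme the
    abstract attributes to TensorFlow and MXNet.
    """
    X_shards = np.array_split(X, num_gpus)
    y_shards = np.array_split(y, num_gpus)
    grads = []
    for Xs, ys in zip(X_shards, y_shards):
        # Partial gradient of (1/2n) * ||Xs @ w - ys||^2 on this shard
        residual = Xs @ w - ys
        grads.append(Xs.T @ residual / len(ys))
    g = np.mean(grads, axis=0)  # stand-in for the all-reduce that combines partial gradients
    return w - lr * g

# Usage: recover known weights from synthetic data with 4 simulated GPUs.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true
w = np.zeros(4)
for _ in range(200):
    w = synchronous_sgd_step(w, X, y, num_gpus=4)
print(np.round(w, 3))  # converges to approximately w_true
```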
