MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning

@article{Mamidala2018MXNETMPIEM,
  title={MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning},
  author={Amith R. Mamidala and Georgios Kollias and Chris Ward and Fausto Artico},
  journal={CoRR},
  year={2018},
  volume={abs/1801.03855}
}
Existing Deep Learning frameworks use either the Parameter Server (PS) approach or MPI parallelism, but not both. In this paper, we discuss the drawbacks of each approach and propose a generic framework in which the PS and MPI programming paradigms co-exist. The key advantage of the new model is that it embeds the scaling benefits of MPI parallelism into the loosely coupled PS task model. Apart from providing a practical usage model of MPI in the cloud, such a framework allows for novel…