
Communication-Efficient Stochastic Gradient Descent, with Applications to Neural Networks

@inproceedings{Alistarh2017CommunicationEfficientSG,
  title={Communication-Efficient Stochastic Gradient Descent, with Applications to Neural Networks},
  author={Dan Alistarh and Demjan Grubic and Jerry Liu and Ryota Tomioka and Milan Vojnovic},
  booktitle={NIPS 2017},
  year={2017}
}
Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to its excellent scalability properties. A fundamental barrier when parallelizing SGD is the high bandwidth cost of communicating gradient updates between nodes; consequently, several lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always guarantee convergence, and it is not…
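For illustration, below is a minimal sketch of the kind of lossy gradient quantization the abstract refers to, assuming a simple unbiased stochastic rounding scheme over a fixed number of levels. It is not the paper's exact encoding; the function name and parameters are hypothetical.

import numpy as np

def quantize_gradient(g, levels=4, rng=None):
    """Unbiased stochastic quantization of a gradient vector.

    Each coordinate is mapped onto a grid of `levels` magnitudes in
    [0, ||g||_2], rounding up or down at random so that the quantized
    vector equals g in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(g)
    # Position of each |g_i| / ||g||_2 on the grid of `levels` buckets.
    scaled = np.abs(g) / norm * levels
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (keeps the estimate unbiased).
    prob_up = scaled - lower
    quantized_level = lower + (rng.random(g.shape) < prob_up)
    return np.sign(g) * norm * quantized_level / levels

# Toy usage: a worker quantizes its local gradient before communicating it.
g = np.array([0.3, -1.2, 0.05, 0.8])
print(quantize_gradient(g, levels=4))

Communicating only the norm, signs, and integer levels per coordinate is what reduces bandwidth relative to sending full-precision gradients.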
