Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning

@article{Zheng2016AsynchronousSG,
  title={Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning},
  author={Shuxin Zheng and Qi Meng and Taifeng Wang and Wei Chen and Nenghai Yu and Zhiming Ma and Tie-Yan Liu},
  journal={CoRR},
  year={2016},
  volume={abs/1609.08326}
}
With the rapid development of deep learning, people have begun to train very large neural networks on massive data. Asynchronous Stochastic Gradient Descent (ASGD) is widely used for this task but is known to suffer from the problem of delayed gradients: by the time a local worker adds the gradient it has computed to the global model, the global model may already have been updated by other workers, making that gradient "delayed". We propose a novel technique to compensate for this…
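The delayed-gradient problem the abstract describes can be sketched with a toy 1-D simulation. The compensated variant below uses an illustrative first-order Taylor-style correction; this is an assumption for the sketch, not necessarily the paper's exact method, and the function name and hyperparameters are invented for illustration:

```python
# Toy simulation of delayed gradients in ASGD on f(w) = 0.5 * w**2,
# whose gradient at w is simply w.  A worker snapshots the weights, but its
# gradient is applied only `delay` steps later, after the global model has
# already moved on.  The compensated variant adds a Taylor-style correction
# (an illustrative assumption, not the paper's exact formula):
#     g_comp = g + lam * g * g * (w_now - w_snapshot)
# which nudges the stale gradient toward the gradient at the current weights.

def run_asgd(steps=400, delay=4, lr=0.1, lam=1.0, compensate=False):
    w = 1.0                       # current "global" weight
    in_flight = []                # FIFO of (weight_snapshot, gradient) pairs
    for _ in range(steps):
        in_flight.append((w, w))  # gradient of 0.5*w^2 at the snapshot is w
        if len(in_flight) > delay:
            w_old, g = in_flight.pop(0)   # a delayed gradient finally arrives
            if compensate:
                g = g + lam * g * g * (w - w_old)
            w -= lr * g
    return abs(w)

print(f"|w| after plain delayed ASGD:       {run_asgd():.6f}")
print(f"|w| after compensated delayed ASGD: {run_asgd(compensate=True):.6f}")
```

On this toy quadratic both runs converge; the point of compensation is that the applied gradient is moved closer to what would have been computed at the current weights, which matters more as the delay or the learning rate grows.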

