Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Xiangru Lian, Yijun Huang, Yuncheng Li and Ji Liu
Asynchronous parallel implementations of stochastic gradient (SG) have been widely used to train deep neural networks and have achieved many practical successes recently. However, existing theory cannot explain their convergence and speedup properties, mainly because of the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill this gap and provide theoretical support, this paper studies two asynchronous parallel implementations of SG: one is…
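The shared-memory setting the abstract describes can be illustrated with a minimal sketch (not the paper's exact algorithm): several worker threads update one parameter vector without locks, so each worker may compute its gradient from a stale read. The toy objective, learning rate, and worker counts here are illustrative assumptions.

```python
import random
import threading

def run_async_sgd(num_workers=4, steps_per_worker=500, lr=0.01, dim=8):
    """Minimize f(x) = sum_i (x_i - 1)^2 with noisy coordinate gradients,
    applied asynchronously by several threads (Hogwild!-style sketch)."""
    x = [0.0] * dim  # shared parameter vector, updated without locks

    def worker(seed):
        # Per-worker RNG: sharing one random.Random across threads is unsafe.
        rng = random.Random(seed)
        for _ in range(steps_per_worker):
            i = rng.randrange(dim)
            # Stochastic gradient of (x_i - 1)^2: exact gradient plus noise.
            g = 2.0 * (x[i] - 1.0) + rng.gauss(0.0, 0.1)
            x[i] -= lr * g  # lock-free read-modify-write: updates may race

    threads = [threading.Thread(target=worker, args=(s,))
               for s in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

if __name__ == "__main__":
    x = run_async_sgd()
    print(max(abs(xi - 1.0) for xi in x))  # all coordinates end up near 1
```

Because the read of `x[i]` and the write back are not atomic as a pair, a worker can overwrite a concurrent update with a value computed from stale parameters; bounding the effect of exactly this staleness is what the paper's analysis is about.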
Highly Influential
This paper has highly influenced 24 other papers.
Highly Cited
This paper has 168 citations.
Related Discussions
This paper has been referenced on Twitter 27 times.


Publications citing this paper.
Showing 1-10 of 119 extracted citations

Distributed deep learning on edge-devices: Feasibility via adaptive compression

2017 IEEE 16th International Symposium on Network Computing and Applications (NCA) • 2017
Highly Influenced



Publications referenced by this paper.
Showing 1-10 of 35 references

Caffe: Convolutional Architecture for Fast Feature Embedding

ACM Multimedia • 2014
Highly Influenced

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

SIAM Journal on Optimization • 2013
Highly Influenced

Distributed delayed stochastic optimization

2012 IEEE 51st IEEE Conference on Decision and Control (CDC) • 2011
Highly Influenced
