SW-SGD: The Sliding Window Stochastic Gradient Descent Algorithm

Imen Chakroun, Tom Haber, Thomas J. Ashby
Stochastic Gradient Descent (SGD, or 1-SGD in our notation) is probably the most popular family of optimisation algorithms used in machine learning on large data sets, because it optimises efficiently with respect to the number of complete passes over the training set (epochs). Various authors have worked on data or model parallelism for SGD, but there is little work on how SGD fits with the memory hierarchies ubiquitous in HPC machines. Standard practice suggests randomising the order…
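For readers unfamiliar with the baseline, here is a minimal sketch of plain SGD that performs one update per training point and reshuffles the visiting order at every epoch. It assumes that "1-SGD" means a batch size of one, which the truncated abstract does not state explicitly. It is not the SW-SGD algorithm itself. The function name `sgd_one_sample`, the linear model, the squared-error loss, and all hyperparameter values are hypothetical choices made only for illustration.

```python
import random


def sgd_one_sample(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w*x + b with one-sample-per-update SGD (illustrative only)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        # One epoch = one complete pass over the training set,
        # visited in a freshly randomised order.
        rng.shuffle(order)
        for i in order:
            # Gradient of 0.5 * (w*x + b - y)^2 for a single point.
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (assumed for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_one_sample(xs, ys)
```

Each inner-loop step touches a single, randomly chosen training point. That access pattern is what makes the interaction with caches and other levels of the memory hierarchy worth examining.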


