Stochastic gradient descent (often shortened to SGD), also known as incremental gradient descent, is a stochastic approximation of the gradient…
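
To illustrate the definition above, here is a minimal sketch of SGD on a one-dimensional least-squares fit. It is not taken from any of the papers listed on this page. The function name `sgd_least_squares`, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
import random


def sgd_least_squares(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with plain SGD (hypothetical helper, illustration only).

    Each update uses the gradient of the squared error on a single sample,
    rather than the gradient of the loss summed over the whole data set.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a fresh random order
        for i in indices:
            err = (w * xs[i] + b) - ys[i]  # residual on one sample
            # Gradient of 0.5 * err**2 with respect to w and b
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (assumed for the example)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_least_squares(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Because the toy data are noise-free, the single-sample updates should settle near the generating values `w = 2`, `b = 1`. On noisy data a constant step size would typically leave the iterates fluctuating around the minimiser.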

Semantic Scholar uses AI to extract papers important to this topic.

2017

There is widespread sentiment that fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) are not…

Highly Cited

2013

- Rie Johnson, Tong Zhang
- NIPS
- 2013

Stochastic gradient descent is popular for large scale optimization but has slow convergence asymptotically due to the inherent…

Highly Cited

2012

- Léon Bottou
- Neural Networks: Tricks of the Trade
- 2012

Chapter 1 strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a…

Highly Cited

2011

- Benjamin Recht, Christopher Ré, Stephen J. Wright, Feng Niu
- NIPS
- 2011

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine…

Highly Cited

2011

We provide a novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of…

Highly Cited

2010

- Martin Zinkevich, Markus Weimer, Alexander J. Smola, Lihong Li
- NIPS
- 2010

With the increase in available data parallel machine learning has become an increasingly pressing problem. In this paper…

Highly Cited

2009

- Antoine Bordes, Léon Bottou, Patrick Gallinari
- Journal of Machine Learning Research
- 2009

The SGD-QN algorithm is a stochastic gradient descent algorithm that makes careful use of second-order information and splits the…

Highly Cited

2008

- Stefan Klein, Josien P. W. Pluim, Marius Staring, Max A. Viergever
- International Journal of Computer Vision
- 2008

We present a stochastic gradient descent optimisation method for image registration with adaptive step size prediction. The…

Highly Cited

2004

- Tong Zhang
- ICML
- 2004

Linear prediction methods, such as least squares for regression, logistic regression and support vector machines for…

Highly Cited

1999

Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive…
