Stochastic gradient descent

Known as: incremental gradient descent, gradient descent in machine learning. AdaGrad is a closely related variant with per-coordinate adaptive learning rates.
Stochastic gradient descent (often abbreviated SGD), also known as incremental gradient descent, is a stochastic approximation of gradient descent optimization: it replaces the true gradient, computed over the entire data set, with an estimate computed from a single randomly chosen example or a small mini-batch.
Wikipedia
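
The update itself is simple: at each step, take a small step against the gradient of the loss on one randomly chosen example (or a small mini-batch) instead of the full data set. A minimal NumPy sketch for least-squares linear regression, included purely as an illustration:

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=20, seed=0):
    """Minimal SGD for least-squares linear regression.

    Each step uses the gradient of the loss on ONE randomly chosen
    example instead of the full-batch gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):          # one pass over shuffled data
            grad = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5*(x_i.w - y_i)^2
            w -= lr * grad                    # stochastic update
    return w

# Tiny usage example on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)
print(sgd_linear_regression(X, y))  # should land close to true_w
```
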

Topic mentions per year (1980-2018): chart not reproduced.

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2017
There is widespread sentiment that fast gradient methods (e.g. Nesterov’s acceleration, conjugate gradient, heavy ball) are not…

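The fast methods named above differ mainly in how they reuse previous gradients. As a point of reference, here is a minimal heavy-ball (momentum) variant of the SGD update; this is a generic sketch, not any particular paper's construction:

```python
import numpy as np

def sgd_heavy_ball(grad_fn, w0, lr=0.01, momentum=0.9, steps=1000):
    """SGD with a heavy-ball momentum term.

    v accumulates an exponentially weighted sum of past gradients,
    which is what accelerates progress along persistent directions.
    """
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)                 # stochastic (or exact) gradient at w
        v = momentum * v - lr * g      # velocity update
        w = w + v                      # heavy-ball step
    return w

# Usage on a simple quadratic f(w) = 0.5 * ||w||^2 with noisy gradients.
rng = np.random.default_rng(0)
noisy_grad = lambda w: w + 0.1 * rng.normal(size=w.shape)
print(sgd_heavy_ball(noisy_grad, np.array([5.0, -3.0])))  # approaches the origin
```
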
Highly Cited, 2013
Stochastic gradient descent is popular for large scale optimization but has slow convergence asymptotically due to the inherent…

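The snippet is cut off, but the usual culprit behind the slow asymptotic convergence is the variance of the single-example gradient estimate. One widely used remedy is variance reduction in the SVRG style; the sketch below illustrates that general idea for least squares and is not claimed to reproduce the cited paper's algorithm:

```python
import numpy as np

def svrg_least_squares(X, y, lr=0.02, outer=10, inner=None, seed=0):
    """Variance-reduced SGD (SVRG-style) for least squares.

    A full gradient is computed at a snapshot point; each stochastic step
    is corrected with that snapshot so the estimate's variance shrinks
    as the iterates approach the optimum.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    inner = inner or n
    w = np.zeros(d)
    for _ in range(outer):
        w_snap = w.copy()
        mu = X.T @ (X @ w_snap - y) / n             # full gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            g_w = (X[i] @ w - y[i]) * X[i]          # stochastic gradient at w
            g_snap = (X[i] @ w_snap - y[i]) * X[i]  # same example at the snapshot
            w -= lr * (g_w - g_snap + mu)           # variance-corrected step
    return w

# Usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
true_w = np.array([1.0, -1.0, 2.0, 0.0])
y = X @ true_w + 0.05 * rng.normal(size=300)
print(svrg_least_squares(X, y))  # should approach true_w
```
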
Highly Cited, 2012
Chapter 1 strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a…

Highly Cited, 2011
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine…

Highly Cited, 2011
We provide a novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of…

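SGD is a standard tool for this kind of factorization: run stochastic updates over the observed entries of R ≈ U V^T. The toy sketch below shows that generic technique only and makes no claim about the specific distributed algorithm of the cited paper:

```python
import numpy as np

def sgd_matrix_factorization(R, mask, rank=2, lr=0.02, reg=0.01,
                             epochs=500, seed=0):
    """Factor R ≈ U @ V.T with SGD over the observed entries.

    mask[i, j] is True where R[i, j] is observed.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.normal(size=(n, rank))
    V = 0.1 * rng.normal(size=(m, rank))
    obs = np.argwhere(mask)
    for _ in range(epochs):
        rng.shuffle(obs)
        for i, j in obs:
            Ui = U[i].copy()                         # cache the old row factor
            err = R[i, j] - Ui @ V[j]                # residual on one entry
            U[i] += lr * (err * V[j] - reg * Ui)     # gradient step on row factor
            V[j] += lr * (err * Ui - reg * V[j])     # gradient step on column factor
    return U, V

# Usage: recover a small low-rank matrix from 60% of its entries.
rng = np.random.default_rng(1)
true_U, true_V = rng.normal(size=(8, 2)), rng.normal(size=(6, 2))
R = true_U @ true_V.T
mask = rng.random(R.shape) < 0.6
U, V = sgd_matrix_factorization(R, mask)
print(np.abs(R - U @ V.T)[mask].max())  # error on observed entries; should be small
```
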
Highly Cited, 2010
With the increase in available data, parallel machine learning has become an increasingly pressing problem. In this paper…

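One simple way to parallelize SGD, and the flavor of approach studied in this line of work, is to run independent SGD passes on disjoint shards of the data and then average the resulting parameter vectors. The sketch below simulates that sequentially; it is a generic illustration, not the cited paper's exact procedure:

```python
import numpy as np

def local_sgd(X, y, lr=0.01, epochs=5, seed=0):
    """Plain single-example SGD for least squares on one data shard."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

def parameter_averaging_sgd(X, y, workers=4, **kwargs):
    """Split the data across workers, run SGD on each shard, average the results.

    Simulated sequentially here; in a real system each shard runs in parallel.
    """
    X_shards = np.array_split(X, workers)
    y_shards = np.array_split(y, workers)
    ws = [local_sgd(Xs, ys, seed=k, **kwargs)
          for k, (Xs, ys) in enumerate(zip(X_shards, y_shards))]
    return np.mean(ws, axis=0)

# Usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=400)
print(parameter_averaging_sgd(X, y))  # close to true_w
```
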
Highly Cited, 2009
The SGD-QN algorithm is a stochastic gradient descent algorithm that makes careful use of second-order information and splits the…

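The snippet is cut off before explaining how SGD-QN exploits curvature, so the sketch below shows only the broader idea of rescaling SGD steps with a diagonal preconditioner, here built AdaGrad-style from accumulated squared gradients; it is not the SGD-QN update itself:

```python
import numpy as np

def adagrad_least_squares(X, y, lr=0.5, epochs=20, eps=1e-8, seed=0):
    """SGD with a diagonal preconditioner from accumulated squared gradients
    (AdaGrad-style): coordinates that see large gradients get smaller steps."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    h = np.zeros(d)                            # running sum of squared gradients
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]
            h += g * g
            w -= lr * g / (np.sqrt(h) + eps)   # per-coordinate rescaled step
    return w

# Usage on synthetic data with badly scaled features.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3)) * np.array([1.0, 10.0, 0.1])
true_w = np.array([1.0, 0.2, -5.0])
y = X @ true_w + 0.05 * rng.normal(size=300)
print(adagrad_least_squares(X, y))  # should move close to true_w despite the scaling
```
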
Highly Cited, 2008
We present a stochastic gradient descent optimisation method for image registration with adaptive step size prediction. The…

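Adaptive step size prediction means letting the optimizer choose its own step length as it runs. A common heuristic, sketched below for a generic objective, is to grow the step when successive stochastic gradients point the same way and shrink it when they disagree; this illustrates the idea only, not the specific predictor used for registration:

```python
import numpy as np

def sgd_adaptive_step(grad_fn, w0, lr0=0.1, grow=1.2, shrink=0.5, steps=500):
    """SGD with a self-tuning scalar step size.

    If the new stochastic gradient has positive inner product with the previous
    one (still moving the same way), the step size grows; if the inner product
    is negative (overshoot), it shrinks.
    """
    w = np.asarray(w0, dtype=float).copy()
    lr = lr0
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        if g @ g_prev > 0:
            lr *= grow
        elif g @ g_prev < 0:
            lr *= shrink
        w -= lr * g
        g_prev = g
    return w

# Usage on a noisy quadratic; the step size settles at a workable value on its own.
rng = np.random.default_rng(0)
noisy_grad = lambda w: 4.0 * w + 0.05 * rng.normal(size=w.shape)
print(sgd_adaptive_step(noisy_grad, np.array([3.0, -2.0])))  # ends near the origin
```
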
Highly Cited, 2004
Linear prediction methods, such as least squares for regression, logistic regression and support vector machines for…

Highly Cited, 1999
Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive…

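Monitoring the correlation between successive gradients is the basis of delta-bar-delta style gain adaptation: each weight keeps its own learning rate, nudged up when consecutive gradient components agree in sign and cut back when they disagree. The sketch below is a generic version of that rule, not the particular algorithm of the cited paper:

```python
import numpy as np

def sgd_per_weight_gains(grad_fn, w0, lr0=0.05, up=0.01, down=0.9, steps=2000):
    """SGD with one adaptive gain (learning rate) per weight.

    Gains grow additively when the current and previous gradient components
    share a sign (consistent descent direction) and shrink multiplicatively
    when they disagree (oscillation), in the spirit of delta-bar-delta.
    """
    w = np.asarray(w0, dtype=float).copy()
    gains = np.full_like(w, lr0)
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        agree = g * g_prev > 0
        gains = np.where(agree, gains + up, gains * down)  # per-weight adjustment
        w -= gains * g
        g_prev = g
    return w

# Usage on an ill-conditioned noisy quadratic: each weight ends up with its own gain.
rng = np.random.default_rng(0)
curv = np.array([0.5, 10.0])                    # very different curvature per weight
noisy_grad = lambda w: curv * w + 0.02 * rng.normal(size=w.shape)
print(sgd_per_weight_gains(noisy_grad, np.array([4.0, 4.0])))  # near the origin
```
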