# On the momentum term in gradient descent learning algorithms

@article{Qian1999OnTM, title={On the momentum term in gradient descent learning algorithms}, author={N. Qian}, journal={Neural networks : the official journal of the International Neural Network Society}, year={1999}, volume={12}, number={1}, pages={145-151} }

A momentum term is usually included in the simulations of connectionist learning algorithms. Although it is well known that such a term greatly improves the speed of learning, there have been few rigorous studies of its mechanisms. In this paper, I show that in the limit of continuous time, the momentum parameter is analogous to the mass of Newtonian particles that move through a viscous medium in a conservative force field. The behavior of the system near a local minimum is equivalent to a set…
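The physical analogy in the abstract can be made concrete with a small numerical sketch. It is not taken from the paper. It assumes the particle obeys `m * w'' + mu * w' = -grad E(w)` and discretizes that equation with simple finite differences, which gives the familiar update `step = -lr * grad E(w) + momentum * step` with `lr = dt**2 / (m + mu * dt)` and `momentum = m / (m + mu * dt)`. The function names, the quadratic test problem, and all parameter values are illustrative assumptions.

```python
import numpy as np


def momentum_from_physics(mass, friction, dt):
    """Map (mass, friction, time step) to (learning rate, momentum).

    Assumption: finite-difference discretization of
    mass * w'' + friction * w' = -grad E(w).
    """
    denom = mass + friction * dt
    return dt ** 2 / denom, mass / denom


def descend(grad, w0, lr, momentum, steps):
    """Gradient descent with a momentum term; momentum=0 is plain descent."""
    w = np.asarray(w0, dtype=float)
    step = np.zeros_like(w)
    for _ in range(steps):
        step = -lr * grad(w) + momentum * step
        w = w + step
    return w


# Hypothetical test problem: E(w) = 0.5 * w^T A w with an ill-conditioned A.
A = np.diag([1.0, 100.0])


def grad(w):
    return A @ w


w0 = [1.0, 1.0]

# A particle with nonzero mass yields a nonzero momentum parameter.
lr, p = momentum_from_physics(mass=0.9, friction=1.0, dt=0.1)

heavy = descend(grad, w0, lr=lr, momentum=p, steps=200)
plain = descend(grad, w0, lr=lr, momentum=0.0, steps=200)

print("learning rate:", lr, "momentum:", p)
print("distance to minimum with momentum:   ", np.linalg.norm(heavy))
print("distance to minimum without momentum:", np.linalg.norm(plain))
```

Under this mapping a massless particle (`mass=0`) gives `momentum=0`, so plain gradient descent is recovered as the zero-mass case. On this particular quadratic the run with momentum ends closer to the minimum after the same number of steps.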

965 Citations

- Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method. Mathematics, Medicine, 2004. 67 citations. Highly Influenced. Open Access.
- A Global Minimization Algorithm Based on a Geodesic of a Lagrangian Formulation of Newtonian Dynamics. Mathematics, Computer Science, 2007. 2 citations. Open Access.
- On the influence of momentum acceleration on online learning. Computer Science, Mathematics, 2016. 20 citations. Highly Influenced. Open Access.
- Convergence of Cyclic and Almost-Cyclic Learning With Momentum for Feedforward Neural Networks. Computer Science, Medicine, 2011. 25 citations. Open Access.
- Near optimal step size and momentum in gradient descent for quadratic functions. Mathematics, 2017. 2 citations. Highly Influenced. Open Access.
