Local Gain Adaptation in Stochastic Gradient Descent

Abstract

Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive gradients. Here we discuss the limitations of this approach, and develop an alternative by extending Sutton's work on linear systems to the general, nonlinear case. The resulting online algorithms are computationally little more expensive than other acceleration techniques, do not assume statistical independence between successive training patterns, and do not require an arbitrary smoothing parameter. In our benchmark experiments, they consistently outperform other acceleration methods, and show remarkable robustness when faced with non-i.i.d. sampling of the input space.
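
As context for the first sentence of the abstract, the following is a minimal sketch of the conventional gradient-correlation approach to local gain adaptation: each parameter keeps its own gain (learning rate), which is grown or shrunk according to whether successive stochastic gradients for that parameter agree or oscillate. The meta-step size mu, the sign-based multiplicative update, and the noisy quadratic objective are illustrative assumptions for this sketch; they are not details taken from the paper, which develops a different algorithm.

import numpy as np

def sgd_with_local_gains(grad_fn, w, steps=1000, eta0=0.01, mu=0.05):
    """Toy SGD with per-parameter gains adapted by the correlation
    (sign of the product) of successive stochastic gradients.
    Hypothetical sketch of the conventional approach, not the paper's method."""
    gains = np.full_like(w, eta0)   # one learning rate per parameter
    g_prev = np.zeros_like(w)       # previous gradient, for the correlation signal
    for _ in range(steps):
        g = grad_fn(w)
        # grow gains where successive gradients agree, shrink where they oscillate
        gains *= np.exp(mu * np.sign(g_prev * g))
        w -= gains * g
        g_prev = g
    return w

# Usage on a noisy quadratic 0.5 * ||w - target||^2 (illustrative only)
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 3.0])
noisy_grad = lambda w: (w - target) + 0.1 * rng.standard_normal(w.shape)
print(sgd_with_local_gains(noisy_grad, np.zeros(3)))

Note that this kind of update implicitly treats successive gradients as independent samples; the abstract's point is that the proposed alternative drops that assumption and the associated smoothing parameter.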

Cite this paper

@inproceedings{Schraudolph1999LocalGA,
  title  = {Local Gain Adaptation in Stochastic Gradient Descent},
  author = {Nicol N. Schraudolph},
  year   = {1999}
}