Learning coefficient dependence on training set size

Abstract

A rule for the selection of the learning coefficient, η, for use in back propagation with batch training of neural networks is presented. The length of the error gradient is shown to increase as more training set examples are presented. This results in slow training or nonconvergence if η is not decreased as the number of input examples increases. The effect of a momentum term is shown to allow a range of η's to produce similar training rates. Two networks having identical topology are trained at different tasks, one with few training patterns (16) and one with many (192). Distinctly different values of η are shown to produce good training for the two networks. We propose selecting η equal to 1.5 divided by the square root of the sum of the squares of the number of each input pattern type. Any group of similar inputs that map to identical outputs constitutes a pattern type. This rule produces a fixed value of η that yields rapid training when coupled with a momentum coefficient of 0.9 for a wide variety of networks.

Keywords--Alpha, Batch training, Coefficient, Eta, Momentum, Priority encoder, Training rate, Z-transform.
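The abstract's selection rule can be sketched in a few lines. This is a minimal illustration, not code from the paper; the function name and the example pattern-type counts are hypothetical:

```python
import math

def learning_coefficient(pattern_type_counts):
    """Eta per the abstract's rule: 1.5 divided by the square root of the
    sum of the squares of the number of examples in each pattern type
    (a pattern type being a group of similar inputs mapping to identical
    outputs)."""
    return 1.5 / math.sqrt(sum(n * n for n in pattern_type_counts))

# Hypothetical case: 16 training patterns forming 16 distinct pattern
# types of one example each, so the sum of squares is 16.
eta = learning_coefficient([1] * 16)  # 1.5 / sqrt(16) = 0.375
```

The abstract pairs this fixed η with a momentum coefficient of 0.9.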

DOI: 10.1016/S0893-6080(05)80026-7

Cite this paper

@article{Eaton1992LearningCD, title={Learning coefficient dependence on training set size}, author={Harry A. C. Eaton and Tracy L. Olivier}, journal={Neural Networks}, year={1992}, volume={5}, pages={283-288} }