Structural adaptation for sparsely connected MLP using Newton's method
A batch training algorithm for feed-forward networks is proposed that uses Newton’s method to estimate a vector of optimal learning factors, one for each hidden unit. Backpropagation, using this learning factor vector, is used to modify the hidden units’ input weights. Linear equations are then solved for the network’s output weights. Elements of the new method’s Gauss-Newton Hessian matrix are shown to be weighted sums of elements from the total network’s Hessian. In several examples, the new method performs better than backpropagation and conjugate gradient while requiring a similar number of multiplies. It performs as well as or better than Levenberg-Marquardt with several orders of magnitude fewer multiplies, because its Hessian has only one row and one column per hidden unit.
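The training step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid hidden activation, the bypass (input-to-output) weights, the use of `np.linalg.lstsq` for both linear solves, and the step-halving safeguard are all choices made here, and the names `output_weights_and_error` and `molf_step` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    # Clipping avoids overflow warnings in exp for very large |a|.
    return 1.0 / (1.0 + np.exp(-np.clip(a, -50.0, 50.0)))

def output_weights_and_error(W, Xa, T):
    """Solve linear least squares for the output weights given hidden weights W.

    Xa: (Nv, N+1) inputs with a bias column, T: (Nv, M) targets.
    Returns output weights, hidden activations, residuals, and the MSE.
    """
    O = sigmoid(Xa @ W.T)                      # (Nv, Nh) hidden activations
    A = np.hstack([Xa, O])                     # assumed bypass + hidden basis
    W_out = np.linalg.lstsq(A, T, rcond=None)[0].T
    err = T - A @ W_out.T                      # (Nv, M) residuals
    return W_out, O, err, np.mean(np.sum(err ** 2, axis=1))

def molf_step(W, Xa, T, max_halvings=10):
    """One iteration: Newton solve for one learning factor per hidden unit."""
    Nv, n_in = Xa.shape
    W_out, O, err, E = output_weights_and_error(W, Xa, T)
    W_oh = W_out[:, n_in:]                     # (M, Nh) hidden-to-output part
    dO = O * (1.0 - O)                         # sigmoid derivative
    back = err @ W_oh                          # (Nv, Nh) back-propagated error
    G = (2.0 / Nv) * (back * dO).T @ Xa        # negative gradient wrt W

    # With W + diag(z) G, d(net_k)/d(z_k) for pattern p is G[k] . Xa[p].
    S = dO * (Xa @ G.T)                        # (Nv, Nh)
    g_z = (2.0 / Nv) * np.sum(back * S, axis=0)        # negative gradient wrt z
    H_z = (2.0 / Nv) * (W_oh.T @ W_oh) * (S.T @ S)     # Nh x Nh Gauss-Newton Hessian
    z = np.linalg.lstsq(H_z, g_z, rcond=None)[0]       # tolerates a singular H_z

    # Assumed safeguard: halve z until the error decreases, else keep W.
    for _ in range(max_halvings):
        W_new = W + z[:, None] * G
        E_new = output_weights_and_error(W_new, Xa, T)[3]
        if E_new < E:
            return W_new, E_new
        z = 0.5 * z
    return W, E

# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Xa = np.hstack([X, np.ones((200, 1))])
T = np.sin(X[:, :1]) + X[:, 1:2] * X[:, 2:3]
W = 0.5 * rng.normal(size=(5, 4))
errs = [output_weights_and_error(W, Xa, T)[3]]
for _ in range(5):
    W, E = molf_step(W, Xa, T)
    errs.append(E)
```

Note that the linear system solved for `z` has only as many unknowns as there are hidden units (5 here), whereas a Levenberg-Marquardt step on the same network would involve a Hessian over all weights.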