Based on an observation about the different effect of ensemble averaging on the bias and variance portions of the prediction error, we discuss training methodologies for ensembles of networks. We demonstrate the effect of variance reduction and present a method of extrapolation to the limit of an infinite ensemble. A significant reduction of variance is obtained by averaging just over initial conditions of the neural networks, without varying architectures or training sets. The minimum of the ensemble prediction error is reached later than that of a single network. In the vicinity of the minimum, the ensemble prediction error appears to be flatter than that of the single network, thus simplifying the optimal stopping decision. The results are demonstrated on the sunspots data, where the predictions are among the best obtained, and on the 1993 energy prediction competition data set B.
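
As an illustration of variance reduction and of extrapolating to an infinite ensemble, the following toy sketch may help. It is not the procedure used in this work: it assumes each ensemble member's error is a shared bias plus independent, equal-variance noise, in which case the squared error of a Q-member average is bias^2 + sigma^2/Q, so a straight-line fit in 1/Q gives the infinite-ensemble limit as its intercept. The function name `ensemble_mse` and all parameter values are hypothetical. Real networks that differ only in their initial conditions are correlated, so the 1/Q behaviour holds only approximately for them.

```python
import numpy as np

def ensemble_mse(q, n_trials=2000, n_points=50, bias=0.3, sigma=1.0, seed=0):
    """Monte Carlo estimate of the squared error of a q-member average.

    Assumption (toy model): every member's error is a shared bias plus
    independent Gaussian noise, standing in for networks that differ
    only in their initial weights.
    """
    rng = np.random.default_rng(seed)
    member_errors = bias + sigma * rng.standard_normal((n_trials, q, n_points))
    # Averaging over the member axis leaves the bias untouched
    # and shrinks the noise variance by a factor of q.
    ensemble_error = member_errors.mean(axis=1)
    return float(np.mean(ensemble_error ** 2))

sizes = np.array([1, 2, 4, 8, 16])
mse = np.array([ensemble_mse(q) for q in sizes])

# Fit mse ~ intercept + slope / Q; the intercept estimates the error of
# an infinite ensemble (bias^2 here), the slope the single-member variance.
slope, intercept = np.polyfit(1.0 / sizes, mse, deg=1)

print("ensemble sizes:", sizes)
print("estimated MSE: ", np.round(mse, 3))
print("extrapolated infinite-ensemble MSE:", round(float(intercept), 3))
print("estimated single-member variance:  ", round(float(slope), 3))
```

In this toy setting the intercept should land near bias^2 = 0.09 and the slope near sigma^2 = 1, which shows why averaging affects the variance portion of the error but not the bias portion.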