Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

Rich Caruana, Steve Lawrence, and C. Lee Giles
The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows a better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity. 2) Regardless of size, nets learn task subco…
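To make the technique in the abstract concrete, here is a minimal sketch of early stopping with backprop, not the paper's exact experimental setup: a deliberately oversized one-hidden-layer net is trained by full-batch gradient descent on a synthetic regression task (the task, layer size, learning rate, and `patience` value are all illustrative assumptions), while a held-out validation loss is monitored and the weights from the best validation epoch are kept.

```python
# Sketch of backprop + early stopping on a net with excess capacity.
# All hyperparameters and the synthetic task are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression task with noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X) + 0.1 * rng.normal(size=(200, 1))
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

# Deliberately oversized hidden layer ("excess capacity").
H = 50
W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)      # hidden activations
    return h, h @ W2 + b2         # hidden layer, network output

lr, patience = 0.05, 200
best_loss, best, wait = np.inf, None, 0

for epoch in range(5000):
    # Backprop on the training set (full-batch gradient descent, MSE loss).
    h, pred = forward(X_tr)
    err = pred - y_tr
    gW2 = h.T @ err / len(X_tr); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)            # tanh derivative
    gW1 = X_tr.T @ dh / len(X_tr); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

    # Early stopping: monitor held-out loss, remember the best weights.
    _, pv = forward(X_va)
    val_loss = np.mean((pv - y_va) ** 2)
    if val_loss < best_loss:
        best_loss, wait = val_loss, 0
        best = (W1.copy(), b1.copy(), W2.copy(), b2.copy())
    else:
        wait += 1
        if wait >= patience:   # no validation improvement for `patience` epochs
            break

# Restore the weights from the best validation epoch.
W1, b1, W2, b2 = best
```

The key design point, matching the abstract's claim, is that the net's size is not reduced; instead, training on the large net is simply cut off at the point where held-out performance stops improving.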

