Early stopping

In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as…
Wikipedia
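The definition above can be illustrated with a minimal sketch. It assumes one common variant of early stopping: monitor the loss on a held-out validation set and stop once it has not improved for a fixed number of epochs (the "patience"). This is not necessarily the rule used by any paper listed on this page. The function names, the toy one-dimensional linear model, and all hyperparameter values are hypothetical, chosen only for illustration.

```python
import random


def make_data(n, noise=0.1):
    """Toy data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = random.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + random.gauss(0.0, noise)))
    return data


def mse(w, b, data):
    """Mean squared error of the linear model w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, val, lr=0.05, max_epochs=1000, patience=10):
    """Full-batch gradient descent that stops when validation loss stalls.

    Returns (best_params, best_val_loss, best_epoch, last_epoch).
    """
    w, b = 0.0, 0.0
    best_val, best_params, best_epoch = float("inf"), (w, b), 0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        # One gradient step on the training loss.
        n = len(train)
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in train) / n
        w -= lr * grad_w
        b -= lr * grad_b

        # Evaluate on held-out data and remember the best parameters so far.
        val_loss = mse(w, b, val)
        if val_loss < best_val:
            best_val, best_params, best_epoch = val_loss, (w, b), epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop before max_epochs.
            break
    return best_params, best_val, best_epoch, epoch
```

A possible usage: split the output of `make_data` into training and validation parts, call `train_with_early_stopping(train, val)`, and use the returned `best_params` rather than the parameters from the final epoch. Returning the best parameters seen so far, instead of the last ones, is a design choice of this sketch.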

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2016
We show that unconverged stochastic gradient descent can be interpreted as a procedure that samples from a nonparametric…
Highly Cited
2008
Sampling is a popular way of scaling up machine learning algorithms to large datasets. The question often is how many samples are…
Highly Cited
2007
In this paper, we study a family of gradient descent algorithms to approximate the regression function from Reproducing Kernel…
Highly Cited
2005
Boosting is one of the most significant advances in machine learning for classification and regression. In its original and…
Highly Cited
2000
The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity…
1999
We show that with a uniform prior on models having the same training error, early stopping at some fixed training error above the…
1999
This paper addresses solutions to the problem of reaching agreement in the presence of faults. Whereas the need for agreement has…
Highly Cited
1998
Cross validation can be used to detect when overfitting starts during supervised training of a neural network; training is then…
1992
Finally, we indicate how to extend one of the previous protocols to be optimal in total bit (and message) complexity and number…
Highly Cited
1990
Two different kinds of Byzantine Agreement for distributed systems with processor faults are defined and compared. The first is…