Train faster, generalize better: Stability of stochastic gradient descent

@inproceedings{Hardt2016TrainFG,
  title={Train faster, generalize better: Stability of stochastic gradient descent},
  author={Moritz Hardt and Benjamin Recht and Yoram Singer},
  booktitle={ICML},
  year={2016}
}
We show that parametric models trained by a stochastic gradient method (SGM) with few iterations have vanishing generalization error. We prove our results by arguing that SGM is algorithmically stable in the sense of Bousquet and Elisseeff. Our analysis only employs elementary tools from convex and continuous optimization. We derive stability bounds for both convex and non-convex optimization under standard Lipschitz and smoothness assumptions. Applying our results to the convex case, we…
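
To make the abstract concrete, here is a minimal LaTeX sketch of the setting: the SGM update the analysis tracks, and the flavor of the convex-case stability bound. This is a sketch under the paper's stated convex, smooth assumptions; the exact step-size condition and constant should be checked against the full text.

  % SGM update: at step t, draw an index i_t uniformly from {1, ..., n}
  % and take a gradient step on the loss of that single example z_{i_t}:
  w_{t+1} = w_t - \alpha_t \, \nabla_w f(w_t; z_{i_t})

  % Convex-case bound (approximate statement): if f(.; z) is convex,
  % L-Lipschitz, and \beta-smooth for every z, and the step sizes satisfy
  % \alpha_t \le 2/\beta, then after T steps SGM is uniformly stable,
  % and hence generalizes, with
  \epsilon_{\mathrm{stab}} \;\le\; \frac{2 L^2}{n} \sum_{t=1}^{T} \alpha_t

Since n is the number of training examples, the bound shrinks as the dataset grows and stays small when T and the step sizes are modest, which is the sense in which training faster (fewer iterations) goes hand in hand with better generalization.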


Citations

Publications citing this paper (245 citations in total, estimated 90% coverage); a selection:

  • Data-Dependent Stability of Stochastic Gradient Descent (cites background, methods, and results; highly influenced)
  • Gradient Diversity: a Key Ingredient for Scalable Distributed Learning (AISTATS 2018; cites methods and background; highly influenced)
  • Stability and Convergence Trade-off of Iterative Optimization Algorithms (cites background and methods; highly influenced)
  • Gradient Diversity Empowers Distributed Learning (cites methods and background; highly influenced)

Citation Statistics

  • 42 highly influenced citations
  • Averaged 69 citations per year over the last three years
