Corpus ID: 204509533

On Empirical Comparisons of Optimizers for Deep Learning

@article{Choi2019OnEC,
  title={On Empirical Comparisons of Optimizers for Deep Learning},
  author={Dami Choi and Christopher J. Shallue and Zachary Nado and Jaehoon Lee and Chris J. Maddison and George E. Dahl},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.05446}
}
Selecting an optimizer is a central step in the contemporary deep learning pipeline. In this paper, we demonstrate the sensitivity of optimizer comparisons to the hyperparameter tuning protocol. Our findings suggest that the hyperparameter search space may be the single most important factor explaining the rankings obtained by recent empirical comparisons in the literature. In fact, we show that these results can be contradicted when hyperparameter search spaces are changed. As tuning effort…
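
One way to see why the choice of search space can decide such comparisons: Adam's update rule contains momentum SGD as a limiting case, reached by taking epsilon large while scaling the learning rate up by epsilon, so a search space that lets epsilon range freely allows Adam to recover momentum almost exactly, while one that pins epsilon near its default does not. The NumPy sketch below illustrates that limit on toy random gradients; it is not code from the paper, and the helper adam_step plus the constants (eps = 1e6, 100 steps, seed 0) are arbitrary choices made for the demonstration.

import numpy as np

def adam_step(theta, g, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # Standard Adam update (Kingma & Ba, 2015), with bias correction.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

rng = np.random.default_rng(0)
theta_adam, theta_mom = np.zeros(3), np.zeros(3)
m, v, buf = np.zeros(3), np.zeros(3), np.zeros(3)
lr, eps = 0.1, 1e6

for t in range(1, 101):
    g = rng.normal(size=3)  # stand-in for a minibatch gradient
    # Adam with a huge eps and the learning rate scaled up by eps:
    # the step is lr * m_hat / (sqrt(v_hat)/eps + 1), which is ~= lr * m_hat.
    theta_adam, m, v = adam_step(theta_adam, g, m, v, t, lr * eps, eps=eps)
    # Bias-corrected momentum (EMA form) seeing the same gradients.
    buf = 0.9 * buf + 0.1 * g
    theta_mom = theta_mom - lr * buf / (1 - 0.9 ** t)

print(np.max(np.abs(theta_adam - theta_mom)))  # tiny (order 1e-7): the two coincide

A search space that instead caps eps at, say, 1e-7 makes this regime unreachable, which is one way two tuning protocols can report opposite rankings for the same pair of optimizers.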

Citations

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
On the Influence of Optimizers in Deep Learning-based Side-channel Analysis
