Learning with Random Learning Rates

Léonard Blier, Pierre Wolinski, Yann Ollivier
Hyperparameter tuning is a bothersome step in the training of deep learning models. One of the most sensitive hyperparameters is the learning rate of the gradient descent. We present the 'All Learning Rates At Once' (Alrao) optimization method for neural networks: each unit or feature in the network gets its own learning rate sampled from a random distribution spanning several orders of magnitude. This comes at practically no computational cost. Perhaps surprisingly, stochastic gradient descent… 
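The core Alrao idea described above can be sketched in a few lines of numpy: each unit (row of a weight matrix) gets its own learning rate, sampled log-uniformly over several orders of magnitude, and plain SGD then uses that per-unit rate. This is a minimal illustration only; the bounds `lr_min`/`lr_max` and the toy layer below are assumptions, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_unit_lrs(n_units, lr_min=1e-5, lr_max=10.0):
    """Sample one learning rate per unit, log-uniformly,
    so the rates span several orders of magnitude."""
    return np.exp(rng.uniform(np.log(lr_min), np.log(lr_max), size=n_units))

# Toy layer: weights of shape (n_units, n_inputs); each row (unit) keeps its own rate.
n_units, n_inputs = 4, 3
W = rng.normal(size=(n_units, n_inputs))
lrs = sample_unit_lrs(n_units)

grad = rng.normal(size=W.shape)   # stand-in for a backpropagated gradient
W -= lrs[:, None] * grad          # per-unit SGD step with the sampled rates
```

With enough units, some of them end up with a near-optimal rate; the paper's observation is that the network can then rely on those units, which is why this works without tuning.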
Disentangling Adaptive Gradient Methods from Learning Rates
A "grafting" experiment is introduced which decouples an update's magnitude from its direction, finding that many existing beliefs in the literature may have arisen from insufficient isolation of the implicit schedule of step sizes.
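The grafting construction above combines two optimizers' steps: the update takes its magnitude from one and its direction from the other. A minimal sketch (the `adam_step`/`sgd_step` vectors below are hypothetical stand-ins for steps produced by real optimizers):

```python
import numpy as np

def graft(step_m, step_d, eps=1e-12):
    """Grafted update: the norm of step_m applied along the direction of step_d."""
    return np.linalg.norm(step_m) * step_d / (np.linalg.norm(step_d) + eps)

g = np.array([3.0, 4.0])
adam_step = np.array([0.1, 0.0])   # hypothetical adaptive step: magnitude donor
sgd_step = 0.01 * g                # hypothetical SGD step: direction donor
grafted = graft(adam_step, sgd_step)
```

Comparing training with `graft(adaptive, sgd)` against both parents is what isolates the implicit step-size schedule from the update direction.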
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
The infinite-width limit of deep feedforward neural networks whose weights are dependent and modelled via a mixture of Gaussian distributions is studied, and it is shown that, in this regime, the weights are compressible and feature learning is possible.
Learning Rate Optimisation of an Image Processing Deep Convolutional Neural Network
A mathematical model is developed to identify an optimal learning rate (OLR) for an image processing deep convolutional neural network (DCNN) and a model validation graph is extrapolated, which will illustrate the mathematical model accuracy and the region of interest (ROI).
The Effect of Adaptive Learning Rate on the Accuracy of Neural Networks
This work assessed the effect of different learning rates on CNN plant leaf disease classification and found that a learning rate of 0.001 achieved the highest accuracy.
RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee
An efficient first-order training algorithm is introduced that theoretically guarantees to converge to the optimum network parameters and is truly online such that it does not make any assumption on the learning environment to guarantee convergence.
Annealed Label Transfer for Face Expression Recognition
A method for recognizing facial expressions using information from a pair of domains, one with labelled data and one with unlabelled data, which departs from the traditional semi-supervised framework towards a transfer learning approach.
Implementation of a deep learning model for automated classification of Aedes aegypti (Linnaeus) and Aedes albopictus (Skuse) in real time
This work proposed a highly accessible method to develop a deep learning model and implement the model for mosquito image classification by using hardware that could regulate the development process, and illustrated how to set up supervised deep convolutional neural networks (DCNNs) with hyperparameter adjustment.
Discrimination of malignant from benign thyroid lesions through neural networks using FTIR signals obtained from tissues
Neural-network-based tools were able to predict thyroid cancer from infrared spectroscopy of tissues with a high level of diagnostic performance in comparison to the gold standard.
Music Visualization Based on Spherical Projection With Adjustable Metrics
A method to normalize data in MIDI files by 12 dimensional vector descriptors extracted from tonality as well as a novel technique for dimensionality reduction and visualization of extracted music data by 3D projections is discussed.
A Novel Segmentation Method for Furnace Flame Using Adaptive Color Model and Hybrid-Coded HLO
A novel segmentation method for furnace flame is proposed, combining an adaptive color model with hybrid-coded human learning optimization (AHcHLO); the proposed NACMM outperforms state-of-the-art flame segmentation approaches, providing high detection accuracy and a low false detection rate.


Training Deep Networks without Learning Rates Through Coin Betting
This paper proposes a stochastic gradient descent procedure for deep networks that requires no learning-rate setting, reducing the optimization process to a game of betting on a coin and yielding a learning-rate-free optimal algorithm.
Online Learning Rate Adaptation with Hypergradient Descent
We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a…
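Hypergradient descent adapts the learning rate itself by gradient descent: the hypergradient of the loss with respect to the rate is the dot product of consecutive gradients, so the rate grows while successive gradients agree and shrinks when they oppose. A minimal sketch on a toy quadratic (the values of `alpha`, `beta`, and the objective are illustrative assumptions):

```python
import numpy as np

def hypergradient_sgd(grad_fn, theta, alpha=0.01, beta=1e-4, steps=100):
    """SGD whose learning rate alpha is itself updated by a gradient step
    on the hypergradient g_t . g_{t-1} (the HD-SGD rule)."""
    g_prev = np.zeros_like(theta)
    for _ in range(steps):
        g = grad_fn(theta)
        alpha += beta * (g @ g_prev)   # adapt the rate from consecutive gradients
        theta = theta - alpha * g      # ordinary SGD step with the adapted rate
        g_prev = g
    return theta, alpha

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is simply x.
theta, alpha = hypergradient_sgd(lambda x: x, np.array([1.0, -2.0]))
```

On this convex toy problem the gradients keep pointing the same way, so `alpha` increases from its initial value and the iterates contract toward the optimum.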
No more pesky learning rates
The proposed method to automatically adjust multiple learning rates so as to minimize the expected error at any one time relies on local gradient variations across samples, making it suitable for non-stationary problems.
The Marginal Value of Adaptive Gradient Methods in Machine Learning
It is observed that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance, suggesting that practitioners should reconsider the use of adaptive methods to train neural networks.
Gradient-based Hyperparameter Optimization through Reversible Learning
This work computes exact gradients of cross-validation performance with respect to all hyperparameters by chaining derivatives backwards through the entire training procedure, which allows us to optimize thousands of hyperparameters, including step-size and momentum schedules, weight initialization distributions, richly parameterized regularization schemes, and neural network architectures.
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
A novel algorithm is introduced, Hyperband, for hyperparameter optimization as a pure-exploration non-stochastic infinite-armed bandit problem where a predefined resource like iterations, data samples, or features is allocated to randomly sampled configurations.
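Hyperband's inner loop is successive halving: give every sampled configuration a small resource budget, keep the best fraction, multiply the budget, and repeat. A minimal sketch of one such bracket (the toy loss function and the choice of learning rates as "configurations" are illustrative assumptions; full Hyperband runs several brackets with different aggressiveness):

```python
import math

def successive_halving(configs, evaluate, budget=1, eta=3):
    """One bracket: evaluate every config at the current budget,
    keep the top 1/eta, multiply the budget by eta, repeat until one remains."""
    while len(configs) > 1:
        scored = sorted(configs, key=lambda c: evaluate(c, budget))  # lower is better
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]

# Toy: configs are learning rates; pretend the loss is the log-distance from 1e-3.
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0, 1e-6, 3e-3]
best = successive_halving(lrs, lambda lr, b: abs(math.log10(lr) + 3))
```

The point of the budget schedule is that bad configurations are discarded after only a cheap partial evaluation, so most of the total resource goes to the promising ones.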
Practical Recommendations for Gradient-Based Training of Deep Architectures
  • Yoshua Bengio
  • Computer Science
    Neural Networks: Tricks of the Trade
  • 2012
Overall, this chapter describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks and closes with open questions about the training difficulties observed with deeper architectures.
Dropout: a simple way to prevent neural networks from overfitting
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
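The mechanism behind dropout is simple enough to sketch directly: at train time each activation is zeroed with probability p, and in the common "inverted" formulation the survivors are scaled by 1/(1-p) so the expected activation is unchanged and test time needs no correction. A minimal numpy sketch (the inverted-scaling variant shown here is the usual modern implementation, not necessarily the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: zero each activation with probability p at train time,
    scaling survivors by 1/(1-p); identity at test time."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p          # keep each unit with probability 1-p
    return x * mask / (1.0 - p)

a = np.ones(1000)
d = dropout(a, p=0.5)                        # roughly half zeros, survivors scaled to 2.0
```

Because each forward pass samples a different mask, training effectively averages over an exponential number of thinned subnetworks, which is the source of the regularization effect.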
Neural Architecture Search with Reinforcement Learning
This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
Understanding deep learning requires rethinking generalization
These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth two neural networks already have perfect finite sample expressivity.