Learning with Random Learning Rates

@article{Blier2019LearningWR,
  title={Learning with Random Learning Rates},
  author={L{\'e}onard Blier and Pierre Wolinski and Yann Ollivier},
  journal={ArXiv},
  year={2019},
  volume={abs/1810.01322}
}
Hyperparameter tuning is a bothersome step in the training of deep learning models. One of the most sensitive hyperparameters is the learning rate of the gradient descent. We present the 'All Learning Rates At Once' (Alrao) optimization method for neural networks: each unit or feature in the network gets its own learning rate sampled from a random distribution spanning several orders of magnitude. This comes at practically no computational cost. Perhaps surprisingly, stochastic gradient descent…
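
To make the abstract's idea concrete, below is a minimal sketch of per-unit random learning rates, assuming PyTorch. The helper names (sample_unit_lrs, alrao_sgd_step), the linear-layer setting, and the sampling range are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of the Alrao idea: each output unit gets its own learning rate,
# sampled log-uniformly over several orders of magnitude (range is an assumption).
import math
import torch
import torch.nn as nn

def sample_unit_lrs(layer: nn.Linear, lr_min: float = 1e-5, lr_max: float = 10.0) -> torch.Tensor:
    """One learning rate per output unit, log-uniform between lr_min and lr_max."""
    u = torch.rand(layer.out_features)
    return torch.exp(math.log(lr_min) + u * (math.log(lr_max) - math.log(lr_min)))

def alrao_sgd_step(layer: nn.Linear, unit_lrs: torch.Tensor) -> None:
    """Plain SGD step in which each output unit of `layer` uses its own learning rate."""
    with torch.no_grad():
        # Each row of `weight` feeds one output unit, so scale gradients row-wise
        # by that unit's learning rate.
        layer.weight -= unit_lrs.unsqueeze(1) * layer.weight.grad
        if layer.bias is not None:
            layer.bias -= unit_lrs * layer.bias.grad

# Usage sketch: sample the rates once at initialization, then after every
# loss.backward() call alrao_sgd_step(layer, unit_lrs) for each layer
# instead of optimizer.step().
```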
6 Citations
  • Learning Rate Optimisation of an Image Processing Deep Convolutional Neural Network
  • RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee
  • Annealed Label Transfer for Face Expression Recognition
