Corpus ID: 204904238

An Adaptive and Momental Bound Method for Stochastic Learning

@article{Ding2019AnAA,
  title={An Adaptive and Momental Bound Method for Stochastic Learning},
  author={Jianbang Ding and Xuancheng Ren and Ruixuan Luo and X. Sun},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.12249}
}
Training deep neural networks requires intricate initialization and careful selection of learning rates. The emergence of stochastic gradient optimization methods that use adaptive learning rates based on squared past gradients, e.g., AdaGrad, AdaDelta, and Adam, eases the job slightly. However, recent studies have shown that such methods come with their own pitfalls, including non-convergence issues. Alternative variants have been proposed for enhancement, such as…
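
The abstract describes the family of optimizers that scale each update by statistics of squared past gradients (AdaGrad, AdaDelta, Adam) and notes that they can misbehave, e.g., fail to converge. The Python sketch below shows an Adam-style update together with an upper bound on the per-element step sizes given by their own exponential moving average, which is one plausible reading of the "adaptive and momental bound" in the title; the function name, the extra hyperparameter beta3, and all defaults are illustrative assumptions, not the paper's specification.

import numpy as np

def adamod_style_step(theta, grad, state, lr=1e-3,
                      beta1=0.9, beta2=0.999, beta3=0.999, eps=1e-8):
    """One parameter update in the style the abstract describes.

    The first- and second-moment estimates follow Adam; the extra
    exponential moving average of the per-element step sizes, used as
    an upper bound, is a hypothetical sketch of a "momental bound".
    """
    state["t"] += 1
    t = state["t"]

    # Adam: exponential moving averages of the gradient and its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2

    # Bias-corrected moment estimates.
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = state["v"] / (1 - beta2 ** t)

    # Per-element adaptive step size, as in Adam.
    step = lr / (np.sqrt(v_hat) + eps)

    # Assumed momental bound: smooth the step sizes with their own EMA
    # and cap them by that running average, so isolated extreme
    # learning rates are clipped.
    state["s"] = beta3 * state["s"] + (1 - beta3) * step
    step = np.minimum(step, state["s"])

    return theta - step * m_hat

# Hypothetical usage on a single parameter vector.
theta = np.zeros(4)
state = {"t": 0, "m": np.zeros(4), "v": np.zeros(4), "s": np.zeros(4)}
grad = np.array([0.1, -0.2, 0.05, 0.3])
theta = adamod_style_step(theta, grad, state)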

Citations

EAdam Optimizer: How $\epsilon$ Impact Adam
Decreasing scaling transition from adaptive gradient descent to stochastic gradient descent (Kun Zeng, Jinlan Liu, Zhixia Jiang, Dongpo Xu, ArXiv, 2021)
SSD Object Detection Model Based on Multi-Frequency Feature Theory
Deep Residual 3D U-Net for Joint Segmentation and Texture Classification of Nodules in Lung

References

On the Convergence of Adam and Beyond
Adam: A Method for Stochastic Optimization
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
Improving Generalization Performance by Switching from Adam to SGD
Fixing Weight Decay Regularization in Adam