Corpus ID: 6628106

Adam: A Method for Stochastic Optimization

@article{Kingma2015AdamAM,
  title={Adam: A Method for Stochastic Optimization},
  author={Diederik P. Kingma and Jimmy Ba},
  journal={CoRR},
  year={2015},
  volume={abs/1412.6980}
}
  • Diederik P. Kingma, Jimmy Ba
  • Published 2015
  • Computer Science, Mathematics
  • CoRR
  • We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. [...] The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence…
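
The abstract's "adaptive estimates of lower-order moments" can be made concrete with a short sketch. The code below is an illustrative pure-Python version of an Adam-style update, not the authors' reference implementation. The function name `adam_minimize`, the example gradient `quadratic_grad`, and the test objective are hypothetical. The default hyper-parameter values (step size 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8) are the ones commonly quoted from the paper and are not stated in this listing.

```python
import math


def adam_minimize(grad, theta, steps=1000, lr=0.001,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function from its gradient using Adam-style updates.

    grad  -- callable returning the gradient (list of floats) at theta
    theta -- initial parameters (list of floats)
    The hyper-parameter defaults are assumed, not taken from this listing.
    """
    theta = list(theta)
    m = [0.0] * len(theta)  # running estimate of the first moment (mean)
    v = [0.0] * len(theta)  # running estimate of the second raw moment
    for t in range(1, steps + 1):
        g = grad(theta)
        for i, g_i in enumerate(g):
            # Exponential moving averages of the gradient and its square.
            m[i] = beta1 * m[i] + (1 - beta1) * g_i
            v[i] = beta2 * v[i] + (1 - beta2) * g_i * g_i
            # Correct the bias caused by initializing m and v at zero.
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            # Per-parameter step, scaled by the second-moment estimate.
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Hypothetical objective: f(x, y) = (x - 3)^2 + (y + 1)^2
def quadratic_grad(p):
    return [2 * (p[0] - 3), 2 * (p[1] + 1)]


result = adam_minimize(quadratic_grad, [0.0, 0.0], steps=5000, lr=0.01)
```

Because each step is divided by the square root of the second-moment estimate, the first update has magnitude close to `lr` regardless of the gradient's scale, which is one reason the step size is often described as easy to interpret.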
    On the Convergence of Adam and Beyond
    780
    A Dynamic Sampling Adaptive-SGD Method for Machine Learning
    A Sufficient Condition for Convergences of Adam and RMSProp
    45
    On Adam Trained Models and a Parallel Method to Improve the Generalization Performance
    1
    ADAPTIVE LEARNING RATE METHODS
    ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization
    9
    Adam revisited: a weighted past gradients perspective
    2
    SAdam: A Variant of Adam for Strongly Convex Functions
    8
