# Adam: A Method for Stochastic Optimization

@article{Kingma2015AdamAM,
  title   = {Adam: A Method for Stochastic Optimization},
  author  = {Diederik P. Kingma and Jimmy Ba},
  journal = {CoRR},
  volume  = {abs/1412.6980},
  year    = {2015}
}

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. [...] The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, by which Adam was inspired, are discussed. We also analyze the theoretical convergence [...]
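The update rule the abstract alludes to maintains exponential moving averages of the gradient (first moment) and the squared gradient (second moment), corrects their initialization bias, and scales the step per-parameter. A minimal sketch, using the default hyper-parameters reported in the paper (the function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient grad at step t (1-indexed)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: minimize f(x) = x^2, whose gradient is 2x.
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.1)
```

Because the effective step is roughly bounded by `alpha` regardless of gradient scale, the same hyper-parameters transfer across problems with little tuning, which is the property the abstract emphasizes.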


#### 46,810 Citations

Selected citing papers:

- On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization (2019; Computer Science, Mathematics). 84 citations.
- On Adam Trained Models and a Parallel Method to Improve the Generalization Performance (2018; Computer Science). 1 citation.
- Convergence Guarantees for RMSProp and ADAM in Non-Convex Optimization and an Empirical Comparison to Nesterov Acceleration (2018; Computer Science, Mathematics). 26 citations.
- ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization (2019; Computer Science, Mathematics). 9 citations.

#### References

Showing 3 of the 29 publications referenced by this paper:

- Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods (2014; Mathematics, Computer Science). 71 citations.
- Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning (2011; Mathematics, Computer Science). 429 citations.
- Identifying and attacking the saddle point problem in high-dimensional non-convex optimization (2014; Computer Science, Mathematics). 799 citations.