diffGrad: An Optimization Method for Convolutional Neural Networks

@article{Dubey2020diffGradAO,
  title={diffGrad: An Optimization Method for Convolutional Neural Networks},
  author={S. Dubey and Soumendu Chakraborty and Swalpa Kumar Roy and Snehasis Mukherjee and S. K. Singh and B. Chaudhuri},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  volume={31},
  pages={4500-4511}
}
Stochastic gradient descent (SGD) is one of the core techniques behind the success of deep neural networks. The gradient provides information on the direction in which a function has the steepest rate of change. The main problem with basic SGD is that it updates every parameter with an equal-sized step, irrespective of gradient behavior. Hence, an efficient way to optimize a deep network is to give each parameter its own adaptive step size. Recently, several attempts have been made to improve…
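To make the equal-step versus adaptive-step contrast concrete, the sketch below compares a plain SGD update with a diffGrad-style update: an Adam-like step whose per-parameter magnitude is additionally scaled by a sigmoid "friction" of the difference between the current and previous gradient, so slowly changing gradients get damped steps. This is a minimal illustrative sketch based on the published update rule; the class name, hyperparameter defaults, and the toy quadratic objective are choices made here for illustration, not the authors' reference implementation.

import numpy as np

def sgd_step(theta, grad, lr=0.1):
    # Plain SGD: every parameter moves by lr * gradient, the same
    # global step size regardless of how the gradient behaves.
    return theta - lr * grad

class DiffGradSketch:
    """Illustrative diffGrad-style optimizer (Adam plus gradient-difference friction)."""

    def __init__(self, shape, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(shape)        # first moment (moving average of gradients)
        self.v = np.zeros(shape)        # second moment (moving average of squared gradients)
        self.prev_grad = np.zeros(shape)
        self.t = 0

    def step(self, theta, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)   # bias-corrected first moment
        v_hat = self.v / (1 - self.beta2 ** self.t)   # bias-corrected second moment
        # Friction coefficient: sigmoid of |previous grad - current grad|.
        # Near-constant gradients give xi ~ 0.5 (damped steps); rapidly changing
        # gradients push xi toward 1 (close to a full Adam step).
        xi = 1.0 / (1.0 + np.exp(-np.abs(self.prev_grad - grad)))
        self.prev_grad = grad.copy()
        return theta - self.lr * xi * m_hat / (np.sqrt(v_hat) + self.eps)

# Toy usage (assumed example): minimize f(theta) = sum(theta**2) from a random start.
theta = np.random.randn(5)
opt = DiffGradSketch(theta.shape, lr=0.1)
for _ in range(200):
    grad = 2 * theta                 # gradient of the quadratic objective
    theta = opt.step(theta, grad)
print(theta)                         # roughly near the origin; a small residual oscillation is expected with a fixed step size
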
Citations

Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks
AdaDiffGrad: An Adaptive Batch Size Implementation Technique for DiffGrad Optimization Method
Painless step size adaptation for SGD
Color Channel Perturbation Attacks for Fooling Convolutional Neural Networks and A Defense Against Such Attacks
