On the Convergence of Adam and Beyond
- Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
- International Conference on Learning…
- 15 February 2018
It is shown that one cause of such failures is the exponential moving average used in these algorithms, and it is suggested that the convergence issues can be fixed by endowing such algorithms with 'long-term memory' of past gradients.
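The 'long-term memory' fix can be sketched as an AMSGrad-style update, in which Adam's second-moment estimate is replaced by its running maximum so the effective learning rate never increases. The scalar function below is an illustrative sketch, not the paper's exact pseudocode; names and defaults are assumptions.

```python
def amsgrad_step(theta, grad, m, v, v_hat, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad-style step on a scalar parameter (illustrative sketch).

    Identical to Adam except that v_hat keeps the running maximum of the
    second-moment estimate, so the effective learning rate never increases.
    """
    m = b1 * m + (1 - b1) * grad        # first-moment EMA, as in Adam
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment EMA, as in Adam
    v_hat = max(v_hat, v)               # 'long-term memory' of past gradients
    theta = theta - lr * m / (v_hat ** 0.5 + eps)
    return theta, m, v, v_hat
```

Because `v_hat` is monotone non-decreasing, the step size `lr / sqrt(v_hat)` can only shrink, which is exactly the property the exponential moving average in Adam lacks.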
Logarithmic regret algorithms for online convex optimization
- Elad Hazan, A. Agarwal, Satyen Kale
- Machine-mediated learning
- 22 June 2006
Several algorithms achieving logarithmic regret are proposed, which besides being more general are also much more efficient to implement, and give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field.
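The simplest member of this family of results is online gradient descent with step size 1/(alpha * t), which already attains logarithmic regret for alpha-strongly-convex losses; the sketch below illustrates only that step-size schedule (the paper's Newton-based method is more involved, and all names here are illustrative).

```python
def ogd_strongly_convex_step(x, grad, t, alpha=1.0):
    """Online gradient step with the 1/(alpha * t) step-size schedule
    under which plain gradient updates achieve logarithmic regret for
    alpha-strongly-convex losses (t is 0-indexed here)."""
    eta = 1.0 / (alpha * (t + 1))  # decaying step size, ~1/t
    return x - eta * grad
```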
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
- Sai Praneeth Karimireddy, Satyen Kale, M. Mohri, Sashank J. Reddi, S. Stich, A. Suresh
- International Conference on Machine Learning
- 14 October 2019
This work obtains tight convergence rates for FedAvg and proves that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence, and proposes a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates.
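The drift-corrected local update can be sketched as follows: each client subtracts its own control variate and adds the server's, so heterogeneous local gradients are steered back toward the global direction. This is a minimal toy sketch assuming fixed control variates (the paper also specifies how the variates themselves are updated); all names are illustrative.

```python
def corrected_step(y, g_local, c_i, c, lr=0.1):
    """One SCAFFOLD-style corrected local step: the term (c - c_i)
    counteracts client drift caused by non-iid local gradients."""
    return y - lr * (g_local - c_i + c)
```

Toy usage: if a client's local gradient at y is y + 1 while the true averaged gradient is y, setting `c_i = 1` (the client's bias) and `c = 0` (the mean bias) makes every corrected step follow the true gradient, so the iterate converges to the global optimum 0 instead of drifting to the client's own optimum -1.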
The Multiplicative Weights Update Method: a Meta-Algorithm and Applications
- Sanjeev Arora, Elad Hazan, Satyen Kale
- Theory of Computing
- 1 May 2012
A simple meta-algorithm is presented that unifies many of these disparate algorithms and derives them as simple instantiations of the meta-algorithm.
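The core of the multiplicative weights method is a one-line update: each expert's weight is scaled down exponentially in its observed loss, then the weights are renormalized. A minimal sketch (function name and the choice of eta are illustrative):

```python
import math

def mwu_update(weights, losses, eta=0.5):
    """One multiplicative-weights step: scale each expert's weight down
    exponentially in its observed loss, then renormalize to a distribution."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]
```

Repeating this update concentrates the distribution on low-loss experts, which is the mechanism the meta-algorithm's instantiations share.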
Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
- Elad Hazan, Satyen Kale
- Annual Conference on Computational Learning Theory
- 12 June 2010
An algorithm is given that performs only gradient updates and achieves the optimal rate of convergence for stochastic convex optimization with a strongly convex objective.
Adaptive Methods for Nonconvex Optimization
- M. Zaheer, Sashank J. Reddi, Devendra Singh Sachan, Satyen Kale, Sanjiv Kumar
- Neural Information Processing Systems
- 2018
The result implies that increasing minibatch sizes enables convergence, providing a way to circumvent the non-convergence issues, and a new adaptive optimization algorithm, Yogi, is provided, which controls the increase in the effective learning rate, leading to even better performance with similar theoretical guarantees on convergence.
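Yogi's control of the effective learning rate comes from replacing Adam's multiplicative second-moment EMA with an additive update, so the estimate can change by at most (1 - b2) * grad^2 per step. A minimal scalar sketch (names are illustrative):

```python
def yogi_second_moment(v, grad, b2=0.999):
    """Yogi's second-moment update: v moves toward grad**2 by at most
    (1 - b2) * grad**2 per step, so the effective learning rate cannot
    swing abruptly, unlike Adam's multiplicative EMA."""
    g2 = grad ** 2
    diff = v - g2
    sign = (diff > 0) - (diff < 0)      # sign(v - grad**2) in {-1, 0, 1}
    return v - (1 - b2) * sign * g2
```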
Privacy, accuracy, and consistency too: a holistic solution to contingency table release
- B. Barak, Kamalika Chaudhuri, C. Dwork, Satyen Kale, Frank McSherry, Kunal Talwar
- ACM SIGACT-SIGMOD-SIGART Symposium on Principles…
- 11 June 2007
This work proposes a solution that provides strong guarantees for all three desiderata simultaneously: privacy, accuracy, and consistency among the tables. It applies equally well to the logical cousin of the contingency table, the OLAP cube.
SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning
- Sai Praneeth Karimireddy, Satyen Kale, M. Mohri, Sashank J. Reddi, S. Stich, A. Suresh
- arXiv
- 14 October 2019
A new Stochastic Controlled Averaging algorithm (SCAFFOLD) is presented which uses control variates to reduce the drift between different clients, and it is proved that the algorithm requires significantly fewer rounds of communication and benefits from favorable convergence guarantees.
Projection-free Online Learning
- Elad Hazan, Satyen Kale
- International Conference on Machine Learning
- 18 June 2012
This work presents efficient online learning algorithms that eschew projections in favor of much more efficient linear optimization steps using the Frank-Wolfe technique, and obtains a range of regret bounds for online convex optimization, with better bounds for specific cases such as stochastic online smooth convex optimization.
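A projection-free step of this kind can be sketched over the probability simplex, where the linear optimization oracle simply picks the vertex minimizing the inner product with the gradient and the iterate moves partway toward it. This is a generic Frank-Wolfe sketch under that assumption, not the paper's online algorithm verbatim; names and the step-size choice are illustrative.

```python
def frank_wolfe_step(x, grad, t):
    """One projection-free step over the probability simplex: a linear
    optimization (argmin over vertices of <grad, v>) replaces projection,
    and the iterate moves a step gamma toward the chosen vertex."""
    i = min(range(len(grad)), key=lambda j: grad[j])   # best simplex vertex
    vertex = [1.0 if j == i else 0.0 for j in range(len(x))]
    gamma = 2.0 / (t + 2.0)                            # standard FW step size
    return [(1 - gamma) * xj + gamma * vj for xj, vj in zip(x, vertex)]
```

Because each iterate is a convex combination of simplex vertices, feasibility is maintained for free, with no projection ever computed.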
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
- Alekh Agarwal, Daniel J. Hsu, Satyen Kale, J. Langford, Lihong Li, R. Schapire
- International Conference on Machine Learning
- 3 February 2014
We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes one of K actions in response to the observed context, and observes the reward only for that…
...