• Published 17 May 2022
• Computer Science
• ArXiv
Variance-reduced gradient estimators for policy gradient methods have been one of the main focuses of research in reinforcement learning in recent years, as they allow acceleration of the estimation process. We propose a variance-reduced policy-gradient method, called SHARP, which incorporates second-order information into stochastic gradient descent (SGD) using momentum with a time-varying learning rate. The SHARP algorithm is parameter-free, achieving ε-approximate first-order…
### Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning

• Computer Science
• 2022
A novel decentralized natural policy gradient method, dubbed Momentum-based Decentralized Natural Policy Gradient (MDNPG), is proposed, which incorporates natural gradient, momentum-based variance reduction, and gradient tracking into the decentralized stochastic gradient ascent framework.

