Munchausen Reinforcement Learning
@article{Vieillard2020MunchausenRL,
  title   = {Munchausen Reinforcement Learning},
  author  = {Nino Vieillard and Olivier Pietquin and M. Geist},
  journal = {ArXiv},
  volume  = {abs/2007.14430},
  year    = {2020}
}
Bootstrapping is a core mechanism in Reinforcement Learning (RL). Most algorithms, based on temporal differences, replace the true value of a transiting state by their current estimate of this value. Yet, another estimate could be leveraged to bootstrap RL: the current policy. Our core contribution stands in a very simple idea: adding the scaled log-policy to the immediate reward. We show that slightly modifying Deep Q-Network (DQN) in that way provides an agent that is competitive with…
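The idea in the abstract (add the scaled log-policy of the taken action to the immediate reward, on top of a soft, entropy-regularized bootstrap) can be sketched numerically. The following is a minimal sketch of a Munchausen-style regression target for a single transition, assuming a softmax policy derived from the Q-values and illustrative hyperparameter values (`alpha`, `tau`, and the log-policy clipping floor `l0`); all function and variable names are hypothetical, not taken from the paper's code.

```python
import numpy as np

def softmax_policy(q, tau):
    # Softmax policy over action values with temperature tau,
    # shifted by the max for numerical stability.
    z = q / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def munchausen_target(r, q_s, a, q_next, gamma=0.99, tau=0.03,
                      alpha=0.9, l0=-1.0):
    """Sketch of a Munchausen-DQN-style target for one transition (s, a, r, s').

    q_s    : action values at state s   (shape [A])
    a      : index of the action taken at s
    q_next : action values at state s'  (shape [A])
    """
    pi_s = softmax_policy(q_s, tau)
    pi_next = softmax_policy(q_next, tau)
    # Munchausen bonus: scaled log-policy of the taken action,
    # clipped below at l0 to avoid unbounded negative rewards.
    bonus = alpha * np.clip(tau * np.log(pi_s[a] + 1e-8), l0, 0.0)
    # Soft (entropy-regularized) bootstrap over next-state actions.
    soft_next = np.sum(pi_next * (q_next - tau * np.log(pi_next + 1e-8)))
    return r + bonus + gamma * soft_next
```

With uniform Q-values the log-policy bonus is small and negative while the entropy term of the bootstrap is small and positive, so the target stays close to the plain reward; the bonus only becomes strongly negative for actions the current policy considers unlikely.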