Regularized Policies are Reward Robust

@inproceedings{Husain2021RegularizedPA,
  title={Regularized Policies are Reward Robust},
  author={Hisham Husain and Kamil Ciosek and Ryota Tomioka},
  booktitle={AISTATS},
  year={2021}
}
Entropic regularization of policies in Reinforcement Learning (RL) is a commonly used heuristic to ensure that the learned policy explores the state space sufficiently before overfitting to a locally optimal policy. The primary motivations for using entropy are exploration and the disambiguation of optimal policies; however, its theoretical effects are not entirely understood. In this work, we study the more general regularized RL objective and, using Fenchel duality, derive the dual problem, which takes the form of an adversarial reward problem: a policy maximizing a regularized objective is simultaneously maximizing its return under a worst-case perturbation of the reward, i.e., regularized policies are reward robust.
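
To make the duality concrete, here is a minimal sketch of the Fenchel argument the abstract alludes to (notation ours, not the paper's verbatim derivation; we assume a closed convex regularizer Ω over occupancy measures so that biconjugation applies):

\begin{align*}
\max_{\pi}\; \langle r, \mu_\pi \rangle - \Omega(\mu_\pi)
  &= \max_{\pi}\, \min_{r'}\; \langle r - r', \mu_\pi \rangle + \Omega^*(r')
     && \text{(Fenchel--Young: } \Omega(\mu_\pi) = \textstyle\sup_{r'} \langle r', \mu_\pi \rangle - \Omega^*(r')\text{)} \\
  &= \max_{\pi}\, \min_{\tilde{r}}\; \langle \tilde{r}, \mu_\pi \rangle + \Omega^*(r - \tilde{r})
     && (\tilde{r} := r - r'),
\end{align*}

where $\mu_\pi$ is the occupancy measure of policy $\pi$, $r$ the true reward, and $\Omega^*$ the convex conjugate of $\Omega$. Read this way, the inner minimization is an adversary that replaces $r$ with a perturbed reward $\tilde{r}$ while paying $\Omega^*(r - \tilde{r})$ for the size of the perturbation, so any maximizer of the regularized objective also maximizes return against this worst-case reward.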
