Towards Painless Policy Optimization for Constrained MDPs

Authors: Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup
Venue: Conference on Uncertainty in Artificial Intelligence
We study policy optimization in an infinite-horizon, γ-discounted constrained Markov decision process (CMDP). Our objective is to return a policy that achieves large expected reward with a small constraint violation. We consider the online setting with linear function approximation and assume global access to the corresponding features. We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of…
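
For readers new to the setting, the objective described above can be written as a constrained optimization problem. The block below is the standard discounted CMDP formulation and its Lagrangian, not something quoted from the abstract. The notation is assumed for illustration: ρ is the initial state distribution, r and c are the reward and constraint reward, b is the constraint threshold, and λ is the dual variable. The direction of the constraint inequality is also an assumption.

```latex
% Standard discounted CMDP objective (assumed notation, not from the abstract)
\max_{\pi} \; V_r^{\pi}(\rho)
\quad \text{s.t.} \quad V_c^{\pi}(\rho) \ge b,
\qquad
V_f^{\pi}(\rho) = \mathbb{E}\Big[ \sum_{t=0}^{\infty} \gamma^{t} f(s_t, a_t) \,\Big|\, s_0 \sim \rho,\ \pi \Big],
\quad f \in \{r, c\}.

% Lagrangian saddle-point form that a primal-dual method works with
\max_{\pi} \; \min_{\lambda \ge 0} \; V_r^{\pi}(\rho) + \lambda \big( V_c^{\pi}(\rho) - b \big).
```

In this form, a primal-dual method alternates between updating the policy π to increase the Lagrangian and updating λ to decrease it. Reward sub-optimality and constraint violation are then measured against the optimal feasible policy.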

