Policy Shaping: Integrating Human Feedback with Reinforcement Learning

Shane Griffith, Kaushik Subramanian, Jonathan Scholz, Charles Lee Isbell, Andrea Lockerd Thomaz
A long-term goal of Interactive Reinforcement Learning is to incorporate nonexpert human feedback to solve complex tasks. Some state-of-the-art methods have approached this problem by mapping human information to rewards and values and iterating over them to compute better control policies. In this paper we argue for an alternate, more effective characterization of human feedback: Policy Shaping. We introduce Advise, a Bayesian approach that attempts to maximize the information gained from…