Reinforcement learning from simultaneous human and MDP reward

W. Bradley Knox and Peter Stone
As computational agents are increasingly used beyond research labs, their success will depend on their ability to learn new skills and adapt to their dynamic, complex environments. If human users without programming skills can transfer their task knowledge to agents, learning can accelerate dramatically, reducing costly trials. The TAMER framework guides the design of agents whose behavior can be shaped through signals of approval and disapproval, a natural form of human feedback.



