Reinforcement learning from simultaneous human and MDP reward

@inproceedings{Knox2012ReinforcementLF,
  title={Reinforcement learning from simultaneous human and MDP reward},
  author={W. Bradley Knox and Peter Stone},
  booktitle={AAMAS},
  year={2012}
}

As computational agents are increasingly used beyond research labs, their success will depend on their ability to learn new skills and adapt to their dynamic, complex environments. If human users without programming skills can transfer their task knowledge to agents, learning can accelerate dramatically, reducing costly trials. The TAMER framework guides the design of agents whose behavior can be shaped through signals of approval and disapproval, a natural form of human feedback.