Beyond Rewards: Learning from Richer Supervision


Recently there has been interest in the reinforcement learning community in learning from feedback from the environment that is richer than a scalar reward signal. In this paper we examine the question of learning from sporadic instructions given by a human. Instructions can take several forms, from complete specification of policies to directing the…

2 Figures and Tables

