Overcoming incorrect knowledge in plan-based reward shaping

Abstract

Reward shaping has been shown to significantly improve an agent’s performance in reinforcement learning. Plan-based reward shaping is a successful approach in which a STRIPS plan is used to guide the agent to the optimal behaviour. However, if the provided knowledge is wrong, it has been shown that the agent will take longer to learn the optimal policy…
DOI: 10.1017/S026988891500017X

7 Figures and Tables