Policy Transfer using Value Function as Prior Information


This work proposes an approach based on reward shaping techniques in a reinforcement learning setting to approximate the optimal decision-making process (also called the optimal policy) in a desired task with a limited amount of data. We extract prior information from an existing family of policies and use it as a heuristic to help the construction of…
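As an illustration of how a value function can serve as prior information in reward shaping, the following is a minimal sketch of standard potential-based shaping, where the shaped reward is r + γΦ(s') − Φ(s). It is not taken from this work: the function `shaped_reward`, the table `prior_values`, and the three-state example are all hypothetical, and using a source policy's value estimates as the potential Φ is an assumption about how such a prior might be applied.

```python
from typing import Callable, Hashable

State = Hashable


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return reward + gamma * Phi(next_state) - Phi(state).

    The potential of a terminal state is treated as zero, a common
    convention for episodic tasks.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


# Hypothetical prior: value estimates of an existing policy on a
# three-state chain (assumed numbers, for illustration only).
prior_values = {0: 0.0, 1: 0.5, 2: 1.0}


def potential(state: State) -> float:
    # States missing from the prior get a neutral potential of zero.
    return prior_values.get(state, 0.0)


# Moving from state 0 to state 1 with zero environment reward still
# yields a positive shaped reward, because the prior values state 1 higher.
step_reward = shaped_reward(0.0, 0, 1, potential, gamma=0.9)
```

In this sketch the prior only redistributes reward along a trajectory; the environment reward itself is left unchanged, which is the usual motivation for choosing the potential-based form.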

