Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Abstract

Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample complexity is often impractically large for solving challenging real-world problems, even with off-policy algorithms such as Q-learning. A limiting factor in classic model-free RL is that the learning signal consists only of scalar rewards…
