Temporal Difference Models: Model-Free Deep RL for Model-Based Control


Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample complexity is often impractically high for solving challenging real-world problems, even with off-policy algorithms such as Q-learning. A limiting factor in classic model-free RL is that the learning signal consists only of scalar rewards…

