# Temporal difference learning

## Papers overview

- Journal of Machine Learning Research
- 2016

The temporal-difference methods TD(λ) and Sarsa(λ) form a core part of modern reinforcement learning.

- NIPS
- 2010

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with

Highly Cited

- ICML
- 2009

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in

- ICML
- 2008

This paper extends many of the recent popular policy evaluation algorithms to a generalized framework that includes least-squares

- ICML
- 2006

We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems

Highly Cited

- Machine Learning
- 2002

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating

Highly Cited

- ICML
- 1999

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University

Highly Cited

- Automatica
- 1999

We propose a variant of temporal-difference learning that approximates average and differential costs of an irreducible aperiodic

Highly Cited

- Machine Learning
- 1996

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We

Highly Cited

- 1996

We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon

