Temporal Difference Learning in Continuous Time and Space

Kenji Doya
A continuous-time, continuous-state version of the temporal difference (TD) algorithm is derived in order to facilitate the application of reinforcement learning to real-world control tasks and neurobiological modeling. An optimal nonlinear feedback control law is also derived using the derivatives of the value function. The performance of the algorithms is tested on the task of swinging up a pendulum with limited torque. Both the "critic" that specifies the paths to the upright position and…


