Temporal Abstraction in Reinforcement Learning

@inproceedings{Bacon2000TemporalAI,
  title={Temporal Abstraction in Reinforcement Learning},
  author={Pierre-Luc Bacon and Doina Precup and Richard S. Sutton},
  year={2000}
}
Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in reinforcement learning, but also to predict states, i.e., to learn a model of the world's dynamics. We present theory and algorithms for intermixing TD models of the world at different levels of temporal abstraction within a single structure. Such multi-scale TD models can be used in model-based reinforcement-learning architectures and dynamic programming methods in place of conventional Markov…


