Learning to predict by the methods of temporal differences

@article{Sutton1988LearningTP,
  title={Learning to predict by the methods of temporal differences},
  author={Richard S. Sutton},
  journal={Machine Learning},
  year={1988},
  volume={3},
  pages={9--44}
}
This article introduces a class of incremental learning procedures specialized for prediction, that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's…
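The contrast drawn in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not code from the paper: it assumes a hypothetical five-state random walk that starts in the middle and ends off either edge, a table of predictions `v`, and a fixed step size `alpha`, none of which appear in the text above. Each prediction is moved toward the next prediction (or toward the actual outcome, 0 or 1, on the final step), which is the one-step temporal-difference idea.

```python
import random


def td0_random_walk(episodes=5000, alpha=0.02, seed=0):
    """One-step temporal-difference prediction on an assumed 5-state random walk.

    v[i] is the predicted probability that a walk passing through state i
    eventually exits on the right-hand side (outcome 1) rather than the
    left-hand side (outcome 0).
    """
    rng = random.Random(seed)
    n_states = 5
    v = [0.5] * n_states  # initial predictions (arbitrary choice)

    for _ in range(episodes):
        s = n_states // 2  # every walk starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                target = 0.0  # walk ended on the left: actual outcome
            elif s_next >= n_states:
                target = 1.0  # walk ended on the right: actual outcome
            else:
                target = v[s_next]  # otherwise use the successor's prediction
            # Credit is assigned from the difference between successive predictions.
            v[s] += alpha * (target - v[s])
            if s_next < 0 or s_next >= n_states:
                break
            s = s_next
    return v


if __name__ == "__main__":
    print([round(x, 2) for x in td0_random_walk()])
```

Under these assumptions the learned values should approach (i + 1)/6 for the state at zero-based index i. A conventional supervised method would instead wait for the end of each walk and move every visited state's prediction toward the final outcome.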
Highly Influential
This paper has highly influenced 343 other papers.
Highly Cited
This paper has 5,889 citations.


Citations

Publications citing this paper.

5,889 Citations

[Chart: Citations per Year, 1987 to 2018; vertical axis 0 to 400]
Semantic Scholar estimates that this publication has 5,889 citations based on the available data.


References

Publications referenced by this paper.
Showing 1-10 of 32 references

Temporal credit assignment in reinforcement learning

  • R. S. Sutton
  • Doctoral dissertation, Department of Computer and…
  • 1984

A neuronal model of classical conditioning (Air Force Wright Aeronautical Laboratories Technical Report 87-1139)

  • A. H. Klopf
  • 1987

A temporal-difference model of classical conditioning

  • R. S. Sutton, A. G. Barto
  • Proceedings of the Ninth Annual Conference of the…
  • 1987

Temporal primacy overrides prior training in serial compound conditioning of the rabbit’s nictitating membrane response

  • E. J. Kehoe, B. G. Schreurs, P. Graham
  • Animal Learning and Behavior
  • 1987

Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems

  • J. H. Holland
  • 1986
