Temporal Difference Learning for Heuristic Search and Game Playing

@article{Beal2000TemporalDL,
  title={Temporal Difference Learning for Heuristic Search and Game Playing},
  author={D. F. Beal and M. Smith},
  journal={Inf. Sci.},
  year={2000},
  volume={122},
  pages={3--21}
}
  • D. F. Beal, M. Smith
  • Published 2000
  • Computer Science
  • Inf. Sci.
  • Abstract: Temporal difference (TD) learning is a natural method of reinforcement learning that is particularly appropriate for learning in heuristic search and game playing. Sutton [Machine Learning 3 (1988) 9–44] introduced the TD(λ) method, which is an elegant integration of supervised learning with TD learning. TD(λ) enabled Tesauro’s backgammon program to reach world championship standard. But it can be slow: Tesauro’s program was trained on 1 500 000 games. Recent work [D.F. Beal, M.C. Smith…
    26 Citations
    Temporal Coherence in TD-Learning for Strategic Board Games Case Study Report
    • Highly Influenced
    Online Adaptable Learning Rates for the Game Connect-4
    • 15
    Temporal difference learning with eligibility traces for the game connect four
    • 16
    Learning to Play: Reinforcement Learning and Games
    • 7
    Learning Time Allocation Using Neural Networks
    • 12
    Reinforcement learning in board games
    • 62
    Temporal difference learning
    The estimation of reward and value in reinforcement learning
    • Highly Influenced
    Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
    • 691

    References

    Practical Issues in Temporal Difference Learning
    • 293
    TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play
    • 766
    Natural Developments in Game Research
    • 29
    Increased rates of convergence through learning rate adaptation
    • 1,922
    Evaluation Tuning for Computer Chess: Linear Discriminant Methods
    • 16
    Learning Piece Values Using Temporal Differences
    • 46
    Machine Learning in Computer Chess: The Next Generation
    • 61
    Random Evaluations in Chess
    • 17