Markus Thill

Learn More
Learning complex game functions is still a difficult task. We apply temporal difference learning (TDL), a well-known variant of the reinforcement learning approach, in combination with n-tuple networks to the game Connect-4. Our agent is trained just by self-play. It is able, for the first time, to consistently beat the optimal-playing Minimax agent (in(More)
Systems that learn to play board games are often trained by self-play on the basis of temporal difference (TD) learning. Successful examples include Tesauro's well known TD-Gammon and Lucas' Othello agent. For other board games of moderate complexity like Connect Four, we found in previous work that a successful system requires a very rich initial feature(More)
  • 1