• Publications
Temporal difference learning and TD-Gammon
Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10] the domain of complex board games such as Go, chess, checkers, Othello, and …
  • 1,153
  • 75
Practical Issues in Temporal Difference Learning
This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(λ) algorithm, can be successfully applied to complex real-world problems. A number of …
  • 485
  • 65
TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play
  • G. Tesauro
  • Computer Science
  • Neural Computation
  • 1 March 1994
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm …
  • 727
  • 37
Agent-Human Interactions in the Continuous Double Auction
The Continuous Double Auction (CDA) is the dominant market institution for real-world trading of equities, commodities, derivatives, etc. We describe a series of laboratory experiments that, for the …
  • 250
  • 37
Temporal Difference Learning and TD-Gammon
  • G. Tesauro
  • Computer Science
  • J. Int. Comput. Games Assoc.
  • 1995
We provide an abstract, selectively using the author's formulations: "The article presents a game-learning program called TD-GAMMON. TD-GAMMON is a neural network that trains itself to be an …
  • 550
  • 34
R3: Reinforced Ranker-Reader for Open-Domain Question Answering
In recent years researchers have achieved considerable success applying neural network methods to question answering (QA). These approaches have achieved state of the art results in simplified …
  • 130
  • 31
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation
We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a …
  • 141
  • 24
Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering
A popular recent approach to answering open-domain questions is to first search for question-related passages and then apply reading comprehension models to extract answers. Existing methods usually …
  • 87
  • 24
Utility functions in autonomic systems
Utility functions provide a natural and advantageous framework for achieving self-optimization in distributed autonomic computing systems. We present a distributed architecture, implemented in a …
  • 341
  • 23
Analyzing Complex Strategic Interactions in Multi-Agent Systems
We develop a model for analyzing complex games with repeated interactions, for which a full game-theoretic analysis is intractable. Our approach treats exogenously specified, heuristic strategies, …
  • 165
  • 22