Temporal difference learning and TD-Gammon
TLDR
Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10], the domain of complex board games such as Go, chess, checkers, Othello, and backgammon has been widely regarded as an ideal testing ground for exploring a variety of concepts and approaches in artificial intelligence and machine learning.
  • 1,120
  • 68
Practical Issues in Temporal Difference Learning
This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(λ) algorithm, can be successfully applied to complex real-world problems. A number of…
  • 423
  • 47
Agent-Human Interactions in the Continuous Double Auction
TLDR
We describe a series of laboratory experiments that, for the first time, allow human subjects to interact with software bidding agents in a continuous double auction (CDA).
  • 258
  • 39
TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play
  • G. Tesauro
  • Computer Science
  • Neural Computation
  • 1 March 1994
TLDR
A neural network that is able to learn to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm.
  • 768
  • 37
Temporal Difference Learning and TD-Gammon
  • G. Tesauro
  • Computer Science
  • J. Int. Comput. Games Assoc.
  • 1995
TLDR
We provide an abstract, selectively using the author's formulations: "The article presents a game-learning program called TD-GAMMON that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.
  • 640
  • 37
R3: Reinforced Ranker-Reader for Open-Domain Question Answering
TLDR
We propose a new pipeline for open-domain QA with a Ranker component that learns to rank retrieved passages in terms of likelihood of extracting the ground-truth answer to a given question.
  • 141
  • 32
Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering
TLDR
A popular recent approach to answering open-domain questions is to first search for question-related passages and then apply reading comprehension models to extract answers.
  • 98
  • 26
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation
TLDR
We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens.
  • 152
  • 24
Utility functions in autonomic systems
TLDR
We present a distributed architecture, implemented in a realistic prototype data center, that demonstrates how utility functions can enable a collection of autonomic elements to continually optimize the use of computational resources in a dynamic, heterogeneous environment.
  • 346
  • 22
Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference
TLDR
We propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples.
  • 133
  • 22