In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter & Bartlett, 2001), which computes biased estimates of the performance gradient in POMDPs. The algorithm's chief advantages …
In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program "KnightCap" used TDLEAF(λ) to learn its evaluation function while playing on Internet chess servers. The main success we report is that KnightCap improved from a 1650 …
In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program "KnightCap" used TDLeaf(λ) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). The main success we report is …
Dots-and-Boxes is a child's game which remains analytically unsolved. We implement and evolve artificial neural networks to play this game, evaluating them against simple heuristic players. Our networks do not evaluate or predict the final outcome of the game, but rather recommend moves at each stage. Superior generalisation of play by co-evolved populations …
In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(λ) and another less radical variant, TD-directed(λ). In particular, our chess program, "KnightCap," used …
Sorting is one of the classic problems of computer science. Whilst well understood on sequential machines, the diversity of architectures amongst parallel systems means that algorithms do not perform uniformly on all platforms. This document describes the implementation of a radix-based algorithm for sorting positive integers on a Fujitsu AP1000 …
A distributed heap storage manager has been implemented on the Fujitsu AP1000 multicomputer. The performance of various pre-fetching strategies is experimentally compared. Subjective programming benefits and objective performance benefits of up to 10% in pre-fetching are found for certain applications, but not for all. The performance benefits of pre-fetching …