Rational and Convergent Learning in Stochastic Games

Abstract

This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties, rationality and convergence, as desirable for an agent that learns in the presence of other learning agents. We examine existing reinforcement learning algorithms against these two properties and find that none meets both criteria simultaneously. We then contribute a new learning algorithm, WoLF policy hill-climbing, that is based on a simple principle: “learn quickly while losing, slowly while winning.” The algorithm is proven to be rational, and we present empirical results for a number of stochastic games showing that the algorithm converges.
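The "learn quickly while losing, slowly while winning" principle can be illustrated with a small sketch of a single policy update for one state. The abstract does not spell out the update rule, so everything here is an assumption made for illustration: the function name `wolf_step`, the two step sizes `delta_win` and `delta_lose`, and the test for "winning" (the current policy has a higher expected value than a running average policy under the current action-value estimates).

```python
def wolf_step(policy, avg_policy, q_values, delta_win=0.01, delta_lose=0.04):
    """One hill-climbing step for a single state; returns the new policy.

    All names and default step sizes are illustrative assumptions.
    """
    # Expected value of the current policy and of the average policy.
    current_value = sum(p * q for p, q in zip(policy, q_values))
    average_value = sum(p * q for p, q in zip(avg_policy, q_values))

    # Assumed "winning" test: the current policy beats the average policy.
    # Winning uses the small step, losing uses the large step.
    delta = delta_win if current_value > average_value else delta_lose

    # Move probability mass toward the action with the highest estimate.
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    new_policy = list(policy)
    for a in range(len(policy)):
        if a != best:
            # Take at most an equal share of delta, never going below zero.
            shift = min(new_policy[a], delta / (len(policy) - 1))
            new_policy[a] -= shift
            new_policy[best] += shift
    return new_policy


# Losing (values tie, so not strictly winning): the large step is used.
losing = wolf_step([0.5, 0.5], [0.5, 0.5], [1.0, 0.0])
# Winning (0.8 > 0.5): the small step is used.
winning = wolf_step([0.8, 0.2], [0.5, 0.5], [1.0, 0.0])
```

In this sketch the only difference between the two cases is the step size, which is the point of the principle: an agent that is doing worse than its own average behavior adapts faster than one that is doing better.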

Cite this paper

@inproceedings{Bowling2001RationalAC,
  title={Rational and Convergent Learning in Stochastic Games},
  author={Michael H. Bowling and Manuela M. Veloso},
  booktitle={IJCAI},
  year={2001}
}