# Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

• Published 25 July 2022
• Computer Science
• ArXiv

While single-agent policy optimization in a fixed environment has recently attracted considerable attention in the reinforcement learning community, much less is known theoretically when multiple agents play in a potentially competitive environment. We take steps forward by proposing and analyzing new fictitious play policy optimization algorithms for two-player zero-sum Markov games with structured but unknown transitions. We consider two classes of transition structures…

