Coordinating the Crowd: Inducing Desirable Equilibria in Non-Cooperative Systems
TLDR
An incentive-design method is proposed that modifies agents' rewards in non-cooperative multi-agent systems so that independent, self-interested agents choose actions producing optimal system outcomes in strategic settings.
Learning in Nonzero-Sum Stochastic Games with Potentials
TLDR
This paper introduces a new generation of MARL learners that can handle nonzero-sum payoff structures and continuous settings, and proves that the learning method, SPot-AC, enables independent agents to learn Nash equilibrium strategies in polynomial time.
Decentralised Learning in Systems with Many, Many Strategic Agents
TLDR
This paper proposes a learning protocol that is guaranteed to converge to equilibrium policies even when the number of agents is extremely large, and shows convergence to Nash-equilibrium policies in applications from economics and control theory with thousands of strategically interacting agents.
Multi-Agent Determinantal Q-Learning
TLDR
The proposed multi-agent determinantal Q-learning method generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks, and its effectiveness is demonstrated in comparison with the state-of-the-art.
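For context, determinantal Q-learning builds on determinantal point processes (DPPs), which score a set of selected items by the determinant of a kernel submatrix so that diverse sets are preferred. A minimal statement of that standard definition (the notation $L$, $Y$ is ours, not the paper's):

$$ P(Y) \propto \det(L_Y), \qquad L_Y = \big[L_{ij}\big]_{i,j \in Y}, $$

where $L$ is a positive semi-definite kernel over items; the determinant grows when the selected items are dissimilar, which is the mechanism used to encourage behavioural diversity across agents.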
Stochastic Potential Games
  • D. Mguni, ArXiv, 27 May 2020
TLDR
This paper identifies a subset of SGs known as stochastic potential games (SPGs) for which the (Markov perfect) Nash equilibrium can be computed tractably in polynomial time, and shows that SGs with the potential property are P-complete.
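For reference, a standard form of the potential property that both potential-game papers above exploit: there exists a single function $\phi$ whose change under any unilateral deviation matches the deviating agent's change in reward (notation is ours):

$$ R_i\big(s, (a_i, a_{-i})\big) - R_i\big(s, (a_i', a_{-i})\big) = \phi\big(s, (a_i, a_{-i})\big) - \phi\big(s, (a_i', a_{-i})\big) \qquad \forall\, i,\ s,\ a_i,\ a_i',\ a_{-i}. $$

Because every agent's incentives are aligned with the one function $\phi$, equilibrium computation reduces to optimising $\phi$, which is the intuition behind the tractability results.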
Modelling Behavioural Diversity for Learning in Open-Ended Games
TLDR
By incorporating the diversity metric into best-response dynamics, this work develops diverse fictitious play and a diverse policy-space response oracle for solving normal-form games and open-ended games, and proves the uniqueness of the diverse best response and the convergence of the algorithms on two-player games.
Settling the Variance of Multi-Agent Policy Gradients
TLDR
A rigorous analysis of policy gradient methods is offered by quantifying the contributions of the number of agents and the agents' exploration to the variance of MAPG estimators, and the optimal baseline (OB) that achieves the minimal variance is derived.
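For reference, the classical variance-minimising baseline for a single score-function gradient estimator has a closed form; the paper derives the multi-agent analogue of this quantity. With $g_\theta(a) = \nabla_\theta \log \pi_\theta(a)$ (notation is ours):

$$ b^{*} = \frac{\mathbb{E}\big[\lVert g_\theta(a) \rVert^2 \, Q(a)\big]}{\mathbb{E}\big[\lVert g_\theta(a) \rVert^2\big]}, $$

i.e. an average of $Q$-values weighted by the squared magnitude of the score, rather than the plain value function commonly used as a baseline in practice.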
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games
TLDR
It is derived that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state discounted Stochastic Game to exponential precision is PPAD-complete, indicating that finding an MPE in SGs is highly unlikely to be NP-hard unless NP = co-NP.
A Viscosity Approach to Stochastic Differential Games of Control and Stopping Involving Impulsive Control
This paper analyses a stochastic differential game of control and stopping in which one of the players modifies a diffusion process using impulse controls, and an adversary then chooses a stopping time.
Online Double Oracle
TLDR
This paper proposes new learning algorithms for solving two-player zero-sum normal-form games in which the number of pure strategies is prohibitively large, and shows that ODO is rational in the sense that each agent in ODO can exploit a strategic adversary with a regret bound of O.
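ODO extends the classic double-oracle scheme: solve a restricted game over small strategy subsets, then grow each subset with a best response to the opponent's restricted equilibrium. Below is a minimal sketch of that underlying offline loop, assuming a payoff matrix A and an LP-based matrix-game solver (function names are ours; the online variant replaces the exact equilibrium step with no-regret updates):

```python
# A minimal double-oracle sketch for a two-player zero-sum matrix game.
# A[i, j] is the row player's payoff; the column player receives -A[i, j].
import numpy as np
from scipy.optimize import linprog


def solve_zero_sum(A):
    """Equilibrium of a zero-sum matrix game via the standard LP formulation."""
    m, n = A.shape
    B = A - A.min() + 1.0  # shift payoffs positive so the game value is > 0
    # Row player: max_x min_j (x^T B)_j  <=>  min 1^T u  s.t.  B^T u >= 1, u >= 0.
    row = linprog(np.ones(m), A_ub=-B.T, b_ub=-np.ones(n))
    # Column player: min_y max_i (B y)_i  <=>  max 1^T w  s.t.  B w <= 1, w >= 0.
    col = linprog(-np.ones(n), A_ub=B, b_ub=np.ones(m))
    x = row.x / row.x.sum()
    y = col.x / col.x.sum()
    value = 1.0 / row.x.sum() + A.min() - 1.0  # undo the payoff shift
    return x, y, value


def double_oracle(A, tol=1e-8):
    """Grow restricted strategy sets with best responses until neither gains."""
    m, n = A.shape
    rows, cols = [0], [0]  # restricted pure-strategy sets, seeded arbitrarily
    while True:
        x, y, value = solve_zero_sum(A[np.ix_(rows, cols)])
        x_full, y_full = np.zeros(m), np.zeros(n)
        x_full[rows], y_full[cols] = x, y  # lift to the full strategy space
        br_row = int(np.argmax(A @ y_full))  # best response to the column mix
        br_col = int(np.argmin(x_full @ A))  # best response to the row mix
        # Exploitability of the restricted equilibrium in the full game.
        gap = (A @ y_full)[br_row] - (x_full @ A)[br_col]
        if gap <= tol or (br_row in rows and br_col in cols):
            return x_full, y_full, value
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)
```

The appeal of the loop is that it usually terminates after touching only a small fraction of the pure strategies, which is the scaling property the online variant carries over to games with prohibitively many strategies.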
...