Modelling Cooperation in Network Games with Spatio-Temporal Complexity

Michiel A. Bakker, Richard Everett, Laura Weidinger, Iason Gabriel, William S. Isaac, Joel Z. Leibo, Edward Hughes
The real world is awash with multi-agent problems that require collective action by self-interested agents, from the routing of packets across a computer network to the management of irrigation systems. Such systems have local incentives for individuals, whose behavior has an impact on the global outcome for the group. Given appropriate mechanisms describing agent interaction, groups may achieve socially beneficial outcomes, even in the face of short-term selfish incentives. In many cases…

Multi-agent Reinforcement Learning in Sequential Social Dilemmas
This work analyzes the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games, and characterizes how learned behavior in each domain changes as a function of environmental factors, including resource abundance.
Value-Decomposition Networks For Cooperative Multi-Agent Learning
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
Adaptive Mechanism Design: Learning to Promote Cooperation
A rule for automatically learning how to create the right incentives, based on the players' anticipated parameter updates, leads to cooperation with high social welfare in matrix games in which the agents would otherwise learn to defect with high probability.
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
Empirical results demonstrate that influence leads to enhanced coordination and communication in challenging social dilemma environments, markedly improving the learning curves of the deep RL agents and leading to more meaningful learned communication protocols.
Collective Intelligence, Data Routing and Braess' Paradox
This work considers the problem of designing the utility functions of the utility-maximizing agents in a multi-agent system (MAS) so that they work synergistically to maximize a global utility, and derives an algorithm whose ideal version should outperform having all agents use the Ideal Shortest Path routing Algorithm (ISPA), even in the infinitesimal limit.
Maintaining cooperation in complex social dilemmas using deep reinforcement learning
This work shows how to modify modern reinforcement learning methods to construct agents that act in ways that are simple to understand, nice, provokable, and forgiving, and demonstrates both theoretically and experimentally that such agents can maintain cooperation in Markov social dilemmas.
A Deep Q-Network for the Beer Game with Partial Information
A variant of the Deep Q-Network algorithm is developed that provides good solutions in the beer game even if the other agents do not follow a rational policy, and that can be extended to other decentralized multi-agent cooperative games with partially observed information.
Consequentialist conditional cooperation in social dilemmas with imperfect information
It is shown that, in a large class of games, good strategies can be constructed by conditioning one's behavior solely on outcomes (i.e., one's past rewards), an approach called consequentialist conditional cooperation.
Emergent Tool Use From Multi-Agent Autocurricula
This work finds clear evidence of six emergent phases in agent strategy in the authors' environment, each of which creates a new pressure for the opposing team to adapt, and compares hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.
Learning Reciprocity in Complex Sequential Social Dilemmas
This work presents a general online reinforcement learning algorithm that displays reciprocal behavior towards its co-players and shows that it can induce pro-social outcomes for the wider group when learning alongside selfish agents, both in a 2-player Markov game and in intertemporal social dilemmas.