Value-Decomposition Networks For Cooperative Multi-Agent Learning
- Peter Sunehag, Guy Lever, T. Graepel
- Computer Science, Adaptive Agents and Multi-Agent Systems
- 16 June 2017
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
The Mechanics of n-Player Differentiable Games
- D. Balduzzi, Sébastien Racanière, James Martens, Jakob N. Foerster, K. Tuyls, T. Graepel
- Computer Science, International Conference on Machine Learning
- 15 February 2018
The key result is to decompose the second-order dynamics into two components: the first relates to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
- Marc Lanctot, V. Zambaldi, T. Graepel
- Computer Science, NIPS
- 2 November 2017
An algorithm is described that is based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and on empirical game-theoretic analysis to compute meta-strategies for policy selection; it generalizes previous algorithms such as InRL.
Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input
- Angeliki Lazaridou, K. Hermann, K. Tuyls, S. Clark
- Computer Science, International Conference on Learning…
- 15 February 2018
It is found that the degree of structure in the input data affects the nature of the emerged protocols, thereby corroborating the hypothesis that structured compositional language is most likely to emerge when agents perceive the world as being structured.
Deep reinforcement learning with relational inductive biases
- V. Zambaldi, David Raposo, P. Battaglia
- Computer Science, International Conference on Learning…
- 27 September 2018
The main contribution of this work is to introduce techniques for representing and reasoning about states in model-free deep reinforcement learning agents via relational inductive biases, which can offer advantages in efficiency, generalization, and interpretability.
Inequity aversion improves cooperation in intertemporal social dilemmas
- Edward Hughes, Joel Z. Leibo, T. Graepel
- Economics, Neural Information Processing Systems
- 23 March 2018
It is found that inequity aversion improves temporal credit assignment for the important class of intertemporal social dilemmas and helps explain how large-scale cooperation may emerge and persist.
Inference of concise DTDs from XML data
- G. Bex, F. Neven, T. Schwentick, K. Tuyls
- Computer Science, Very Large Data Bases Conference
- 1 September 2006
The algorithm iDTD (infer DTD) is presented, which learns SOREs from strings by first inferring an automaton using known techniques and then translating that automaton to a corresponding SORE, possibly repairing the automaton when no equivalent SORE can be found.
Evolutionary Dynamics of Multi-Agent Learning: A Survey
- D. Bloembergen, K. Tuyls, Daniel Hennes, M. Kaisers
- Computer Science, Journal of Artificial Intelligence Research
- 2015
This article surveys the dynamical models that have been derived for various multi-agent reinforcement learning algorithms, making it possible to study and compare them qualitatively, and provides a roadmap on the progress that has been achieved in analysing the evolutionary dynamics of multi-agent learning.
Relational Deep Reinforcement Learning
- V. Zambaldi, David Raposo, P. Battaglia
- Computer Science, ArXiv
- 5 June 2018
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception…
A multi-agent reinforcement learning model of common-pool resource appropriation
- J. Pérolat, Joel Z. Leibo, V. Zambaldi, Charlie Beattie, K. Tuyls, T. Graepel
- Computer Science, NIPS
- 20 July 2017
This work studies the emergent behavior of groups of independently learning agents in a partially observed Markov game modeling common-pool resource appropriation and sheds light on the relationship between exclusion, sustainability, and inequality.