• Corpus ID: 235265637

Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao
Extending transfer learning to cooperative multi-agent reinforcement learning (MARL) has recently received much attention. In contrast to the single-agent setting, the coordination indispensable in cooperative MARL constrains each agent's policy. However, existing transfer methods focus exclusively on agent policy and ignore coordination knowledge. We propose a new architecture that realizes robust coordination knowledge transfer through appropriate decomposition of the overall coordination… 

Researches advanced in multi-agent credit assignment in reinforcement learning

This survey summarizes the main challenges in multi-agent credit assignment (MACA) together with their related solutions, the current shortcomings of algorithms addressing these challenges, and prospective future directions for MACA research.

Multi-Agent Policy Transfer via Task Relationship Modeling

This paper proposes to learn effect-based task representations as a common space of tasks, using an alternately fixed training scheme, and demonstrates that the task representation can capture the relationships among tasks and generalize to unseen tasks.

Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

A novel interactiOn Pattern disenTangling (OPT) method is proposed to disentangle not only the joint value function into agent-wise value functions for decentralized execution, but also the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a sub-group of the entities.

QGNN: Value Function Factorisation with Graph Neural Networks

The results show that QGNN outperforms state-of-the-art value factorisation baselines consistently, and introduces a permutation invariant mixer which is able to match the performance of other methods, even with significantly fewer parameters.
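
As a loose illustration of why a permutation-invariant mixer can match others with far fewer parameters, the sketch below (our assumption about the mechanism, not QGNN's actual network) applies one shared encoder to every agent and sum-pools the results, so reordering the agents cannot change the mixed value:

```python
import numpy as np

def perm_invariant_mix(agent_feats, w1, w2):
    """Shared per-agent encoder followed by sum pooling: because the same
    weights are applied to every agent and the pool is order-independent,
    permuting the agents leaves the output unchanged."""
    h = np.tanh(agent_feats @ w1)   # one shared encoder for all agents
    pooled = h.sum(axis=0)          # sum pooling ignores agent order
    return float(pooled @ w2)
```

Sharing `w1` across agents is also what keeps the parameter count independent of the team size.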

Identification of Intelligence Requirements of Military Surveillance for a WSN Framework and Design of a Situation Aware Selective Resource Use Algorithm

The aim of this work is to identify the intelligence requirements of military surveillance for a WSN framework by designing and implementing an algorithm that computes the area under attack and communicates with the nearest neighbor nodes to carry out surveillance of that area.

Learning Transferable Cooperative Behavior in Multi-Agent Teams

This work proposes to create a shared agent-entity graph, where agents and environmental entities form vertices, and edges exist between the vertices which can communicate with each other, and shows that the learned policies quickly transfer to scenarios with different team sizes along with strong zero-shot generalization performance.
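
The shared agent-entity graph idea can be sketched as one round of neighbourhood averaging; the function below is an illustration of the mechanism, not the paper's architecture, and runs unchanged for any number of vertices, which is what enables transfer across team sizes:

```python
import numpy as np

def graph_message_pass(feats, adj):
    """One round of message passing on an agent-entity graph: each vertex
    averages the features of the neighbours it can communicate with (rows
    of `adj` mark edges). The update is shared and graph-indexed, so it
    applies for any team size."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0.0] = 1.0           # isolated vertices keep a zero message
    return (adj @ feats) / deg
```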

UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers

This paper makes the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit tasks with different observation and action configuration requirements, using a transformer-based model to generate a flexible policy by decoupling the policy distribution from the intertwined input observation.
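
A minimal sketch of such policy decoupling, under the assumption of a single self-attention layer whose per-entity outputs score entity-tied actions (the weight names are ours, not the paper's):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def decoupled_policy(entity_feats, wq, wk, wv):
    """Transformer-style policy decoupling (sketch): shared self-attention
    embeds however many observation entities the task provides, and each
    entity's output embedding is reduced to a logit for the action tied to
    that entity, so one set of weights fits different input/output sizes."""
    q, k, v = entity_feats @ wq, entity_feats @ wk, entity_feats @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return (attn @ v).sum(axis=-1)   # one logit per entity-tied action
```

The same `wq`, `wk`, `wv` work for 5 entities or 7, which is the property that lets one architecture cover tasks with different observation and action configurations.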

Multi-Agent Determinantal Q-Learning

The proposed multi-agent determinantal Q-learning method generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks, and its effectiveness is demonstrated in comparison with the state of the art.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
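
The core idea — a critic that conditions on every agent's observation and action during training — can be sketched with a deliberately simple linear form (an illustrative assumption, not the paper's network):

```python
import numpy as np

def centralized_critic(w, obs_all, act_all):
    """Centralized critic (sketch): the Q-estimate for one agent conditions
    on EVERY agent's observation and action during training. The linear
    form and parameter vector `w` are illustrative assumptions."""
    x = np.concatenate([np.ravel(obs_all), np.ravel(act_all)])
    return float(w @ x)
```

At execution time each actor still acts from its own observation only, which is what keeps execution decentralized while training stays centralized.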

Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning

A general formula of $Q_{tot}$ in terms of the per-agent $Q^{i}$ is theoretically derived, based on which a multi-head attention formation to approximate $Q_{tot}$ can be naturally implemented, resulting in not only a refined representation of $Q_{tot}$ with an agent-level attention mechanism but also a tractable maximization algorithm for decentralized policies; extensive experiments demonstrate the approach's effectiveness.
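
A hedged sketch of such an agent-level attention mixer: softmax weights are non-negative and sum to one, so the mixed value below is monotone in every per-agent $Q^{i}$ (the `scores` argument stands in for state-conditioned attention logits, an assumption of this illustration):

```python
import numpy as np

def qatten_mix(agent_qs, scores):
    """Attention-style mixing (sketch): Q_tot is a softmax-weighted sum of
    the per-agent Q^i. Non-negative weights make Q_tot monotone in each
    Q^i, which is what permits decentralized greedy maximization."""
    w = np.exp(scores - scores.max())
    w = w / w.sum()                 # non-negative weights summing to one
    return float(w @ agent_qs)
```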

QPLEX: Duplex Dueling Multi-Agent Q-Learning

A novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function and encodes the IGM principle into the neural network architecture and thus enables efficient value function learning.
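
The IGM-preserving trick can be illustrated independently of the full architecture: if per-agent advantages are non-positive and zero at each agent's greedy action, any positive importance weights leave the joint greedy action unchanged. The helper below is a sketch under that assumption, with fixed weights standing in for QPLEX's learned ones:

```python
import numpy as np

def duplex_dueling_mix(v_i, adv_i, lam):
    """Duplex-dueling-style mix (sketch): Q_tot = sum_i V_i + sum_i lam_i*A_i.
    Advantages A_i are assumed non-positive with A_i = 0 at agent i's greedy
    action, so forcing lam_i > 0 keeps the joint argmax at the individual
    argmaxes -- the IGM principle the entry above refers to."""
    lam = np.abs(lam) + 1e-8        # importance weights must stay positive
    return float(np.sum(v_i) + np.sum(lam * adv_i))
```

With all advantages zero (every agent greedy), the mix reduces to the sum of per-agent values; any non-greedy choice can only lower it.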

A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems

A taxonomy of solutions for the general knowledge reuse problem is defined, providing a comprehensive discussion of recent progress on knowledge reuse in Multiagent Systems (MAS) and of techniques for knowledge reuse across agents (which may or may not be acting in a shared environment).

From Few to More: Large-scale Dynamic Multiagent Curriculum Learning

A novel Dynamic Multiagent Curriculum Learning (DyMA-CL) method is proposed to solve large-scale problems by starting from a multiagent scenario with a small size and progressively increasing the number of agents, along with three transfer mechanisms across curricula to accelerate the learning process.

Value-Decomposition Networks For Cooperative Multi-Agent Learning

This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
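
The additive decomposition is simple enough to state in a few lines; the sketch below also checks the property that makes it work, namely that per-agent greedy actions are jointly greedy under a sum:

```python
import numpy as np

def vdn_mix(agent_qs):
    """VDN's additive decomposition (sketch): the team value is the plain
    sum of agent-wise values, so each agent maximizing its own Q also
    maximizes Q_tot."""
    return float(np.sum(agent_qs))

# Per-agent greedy actions are jointly greedy under a sum:
q1 = np.array([0.1, 0.9, 0.2])      # agent 1's utilities over 3 actions
q2 = np.array([0.5, 0.3, 0.8])      # agent 2's utilities
joint = q1[:, None] + q2[None, :]   # Q_tot over every joint action
best = np.unravel_index(joint.argmax(), joint.shape)
assert best == (q1.argmax(), q2.argmax())
```

This consistency between individual and joint greedy actions is what later mixers (QMIX, QTRAN, QPLEX) relax or generalize while preserving decentralized execution.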

Action Semantics Network: Considering the Effects of Actions in Multiagent Systems

A novel network architecture, named Action Semantics Network (ASN), is proposed that characterizes different actions' influence on other agents using neural networks based on the action semantics between them and can be easily combined with existing deep reinforcement learning algorithms to boost their performance.