Decentralization of Multiagent Policies by Learning What to Communicate
James Paulos, Steven W. Chen, Daigo Shishika, Vijay R. Kumar
2019 International Conference on Robotics and Automation (ICRA)

Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice it is typical for both the encoding and semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization based, or learned. We present an agent architecture and training methodology using neural networks to learn task-oriented communication semantics based on the example of a communication-unaware…


Learning Connectivity for Data Distribution in Robot Teams

This work proposes a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs). The distributed GNN communication policies are trained via reinforcement learning with the average Age of Information as the reward function, which is shown to improve training stability compared to task-specific reward functions.

ForMIC: Foraging via Multiagent RL With Implicit Communication

This work proposes ForMIC, a distributed reinforcement learning approach to multi-agent foraging (MAF) that endows agents with implicit communication abilities via their shared environment. It outperforms existing state-of-the-art MAF algorithms in a set of experiments that vary team size, number and placement of resources, and key environmental dynamics not seen at training time.

Learning from My Partner's Actions: Roles in Decentralized Robot Teams

This work defines separate roles for each agent in a team of robots so that teammates can correctly interpret the meaning behind their partners' actions, and suggests that leveraging and alternating roles leads to performance comparable to teams that explicitly exchange messages.

Decentralized Multi-Agents by Imitation of a Centralized Controller

This work introduces a novel algorithm under the popular framework of centralized training with decentralized execution: a multiagent problem is first solved with a single centralized joint-space learner, which then guides imitation learning for independent decentralized agents.

Learning Decentralized Controllers for Robot Swarms with Graph Neural Networks

This work learns a single common local controller that exploits information from distant teammates using only local communication exchanges. Applying this approach to a decentralized linear quadratic regulator problem shows how faster communication rates and smaller network degree increase the value of multi-hop information.

The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning

This work presents a learning model that accommodates individual non-shared rewards and a differentiable communication channel that is common among all agents, and develops a learning algorithm that elicits the emergence of adversarial communications.

Neurosymbolic Transformers for Multi-Agent Communication

A novel algorithm is proposed that synthesizes a control policy that combines a programmatic communication policy used to generate the communication graph with a transformer policy network used to choose actions, forming a neurosymbolic transformer.

Coverage Control in Multi-Robot Systems via Graph Neural Networks

A decentralized control policy for the robots (realized via a Graph Neural Network) uses inter-robot communication to leverage non-local information for control decisions, achieving a higher quality of coverage than classical approaches that do not communicate.

Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation

This paper proposes an approach to learning sparse discrete communication through backpropagation in the context of MARL, in which agents are incentivized to communicate as little as possible while still achieving high reward, and develops a regularization-inspired message-length penalty term.
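The regularization-inspired message-length penalty described above can be illustrated with a minimal sketch; the function name, penalty form, and weight are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical sketch of a message-length penalty: the agent's objective
# trades task reward against the number of bits it communicates.
def penalized_reward(task_reward, message_bits, penalty_weight=0.1):
    """Return the task reward minus a cost proportional to message length."""
    return task_reward - penalty_weight * message_bits

# An agent that earns reward 1.0 but sends a 5-bit message nets 0.5,
# so shorter messages are preferred whenever they do not hurt the task.
print(penalized_reward(1.0, 5))
```

Under such a penalty, gradient-based training pushes agents toward the sparsest messages that still preserve task reward.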

A Top-Down Approach to Attain Decentralized Multi-agents

This chapter demonstrates a method, centralized expert supervises multi-agents (CESMA), for obtaining decentralized multi-agents through a top-down approach: it first obtains a solution with a centralized controller and then decentralizes it using imitation learning.

Learning Multiagent Communication with Backpropagation

A simple neural model is explored, called CommNet, that uses continuous communication for fully cooperative tasks and the ability of the agents to learn to communicate amongst themselves is demonstrated, yielding improved performance over non-communicative agents and baselines.
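The continuous-communication step at the core of CommNet can be sketched roughly as follows: each agent's hidden state is updated from its own state plus the mean of its teammates' states. The matrix names and sizes here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def commnet_step(h, W, C):
    """One CommNet-style round: mix each agent's hidden state with the
    mean of the other agents' hidden states, then apply a nonlinearity."""
    n = h.shape[0]
    c = (h.sum(axis=0, keepdims=True) - h) / (n - 1)  # mean of the others
    return np.tanh(h @ W.T + c @ C.T)

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 8))        # 4 agents, hidden size 8
W = rng.standard_normal((8, 8)) * 0.1  # self-connection weights
C = rng.standard_normal((8, 8)) * 0.1  # communication weights
print(commnet_step(h, W, C).shape)
```

Because the communication channel is continuous and differentiable, gradients flow through it during backpropagation, which is what lets the agents learn what to communicate.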

Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach

This work considers a scenario where no communication is available, and instead it learns local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem.

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks

Empirical results on two multi-agent learning problems based on well-known riddles demonstrate that DDRQN can successfully solve such tasks and discover elegant communication protocols to do so; this is the first time deep reinforcement learning has succeeded in learning communication protocols.

Planning, Learning and Coordination in Multiagent Decision Processes

The extent to which methods from single-agent planning and learning can be applied in multiagent settings is investigated, along with the decomposition of sequential decision processes so that coordination can be learned locally, at the level of individual states.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

Multi-agent reinforcement learning as a rehearsal for decentralized planning

A Comprehensive Survey of Multiagent Reinforcement Learning

The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.

Multiagent Learning through Neuroevolution

Recent progress in neuroevolution, accelerating evolution through social learning, and measuring the quality of the resulting solutions is reviewed, and avenues for future work are suggested.

Counterfactual Multi-Agent Policy Gradients

A new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients is presented, which uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.
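The centralised critic enables a counterfactual baseline that marginalises out a single agent's action while holding the other agents' actions fixed. From memory, the advantage has roughly this form (the notation is an assumption; consult the paper for the exact symbols):

```latex
A^a(s,\mathbf{u}) \;=\; Q(s,\mathbf{u})
\;-\; \sum_{u'^a} \pi^a\!\left(u'^a \mid \tau^a\right)
Q\!\left(s, \left(\mathbf{u}^{-a}, u'^a\right)\right)
```

Here $\mathbf{u}$ is the joint action, $\mathbf{u}^{-a}$ the actions of all agents other than $a$, and $\tau^a$ agent $a$'s action-observation history; the baseline isolates each agent's individual contribution without changing the expected gradient.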

Optimal and Approximate Q-value Functions for Decentralized POMDPs

This paper studies whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs) and how policies can be extracted from such value functions, and describes a family of algorithms for doing so.