Corpus ID: 233289548

Joint Attention for Multi-Agent Coordination and Social Learning

@article{Lee2021JointAF,
  title={Joint Attention for Multi-Agent Coordination and Social Learning},
  author={Dennis Lee and Natasha Jaques and Chase Kew and Douglas Eck and Dale Schuurmans and Aleksandra Faust},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.07750}
}
Joint attention—the ability to purposefully coordinate attention with another agent, and mutually attend to the same thing—is a critical component of human social cognition. In this paper, we ask whether joint attention can be useful as a mechanism for improving multi-agent coordination and social learning. We first develop deep reinforcement learning (RL) agents with a recurrent visual attention architecture. We then train agents to minimize the difference between the attention weights that… 
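
The mechanism described in the abstract lends itself to a short illustration. Below is a minimal sketch (PyTorch; the divergence measure, grid size, and loss weighting are assumptions for illustration, not the authors' released code) of a joint-attention incentive: each agent produces a softmax attention map over spatial locations, and an auxiliary penalty on the difference between agents' maps encourages them to attend to the same thing.

```python
import torch
import torch.nn.functional as F

def joint_attention_penalty(attn_a, attn_b, eps=1e-8):
    """Symmetric KL between two agents' spatial attention maps.

    attn_a, attn_b: (batch, H*W) tensors, each row a softmax distribution
    over spatial locations. The choice of symmetric KL is illustrative.
    """
    kl_ab = F.kl_div((attn_b + eps).log(), attn_a, reduction="batchmean")
    kl_ba = F.kl_div((attn_a + eps).log(), attn_b, reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)

# Toy usage: two agents attending over a 5x5 grid of visual features.
attn_a = torch.softmax(torch.randn(4, 25), dim=-1)
attn_b = torch.softmax(torch.randn(4, 25), dim=-1)
penalty = joint_attention_penalty(attn_a, attn_b)
# total_loss = rl_loss + beta * penalty   # beta scales the joint-attention incentive
print(penalty.item())
```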

Interpretability for Conditional Coordinated Behavior in Multi-Agent Reinforcement Learning

By visualizing the attention weights from DA6-X, it is confirmed that agents successfully learn situation-dependent coordinated behaviors by correctly identifying various conditional states, leading to improved interpretability of agents along with superior performance.

Social Neuro AI: Social Interaction as the “Dark Matter” of AI

The case is made that empirical results from social psychology and social neuroscience, along with the framework of dynamics, can inspire the development of more intelligent artificial agents and provide a new perspective on the field of multi-agent robot systems, exploring how it can advance along the three axes the authors identify.

Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning With Shapley Values

It is argued that Shapley values are a pertinent way to evaluate the contribution of players in a cooperative multi-agent RL context, although they cannot explain a single run or episode, nor justify the precise actions taken by agents.
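
As a concrete illustration of the Shapley-value idea, here is a hedged sketch that enumerates all coalitions of agents and averages each agent's marginal contribution; `coalition_value` is a hypothetical stand-in for, e.g., the evaluated team return when only the agents in the coalition act.

```python
from itertools import combinations
from math import factorial

def shapley_values(agents, coalition_value):
    """Exact Shapley values by enumerating coalitions (exponential in team size).

    agents: list of agent ids.
    coalition_value: callable mapping a frozenset of agents to a scalar
    team value (hypothetical; e.g., average return with the others disabled).
    """
    n = len(agents)
    values = {}
    for i in agents:
        others = [a for a in agents if a != i]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (coalition_value(s | {i}) - coalition_value(s))
        values[i] = total
    return values

# Toy example: the team scores 1.0 only when agents 0 and 1 cooperate.
v = lambda s: 1.0 if {0, 1} <= s else 0.0
print(shapley_values([0, 1, 2], v))  # {0: 0.5, 1: 0.5, 2: 0.0}
```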

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.
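
For reference, a brief NumPy sketch of the clipped surrogate objective this paper introduces: the probability ratio between the new and old policies is clipped so that a single update cannot move the policy too far.

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over a batch.

    new_logp, old_logp: log-probabilities of the taken actions under the
    current and behavior policies; advantages: estimated advantages.
    """
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))

# Toy check: a ratio of 2 with positive advantage is capped at 1 + eps = 1.2.
print(ppo_clip_objective(np.log([2.0]), np.log([1.0]), np.array([1.0])))
```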

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
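
A minimal sketch of the centralized-critic idea behind this approach (PyTorch; layer sizes and the toy actor are assumptions): during training each critic sees every agent's observation and action, while each actor acts only on its own observation, so execution remains decentralized.

```python
import torch
import torch.nn as nn

class CentralizedCritic(nn.Module):
    """Q-network conditioned on all agents' observations and actions (training only)."""
    def __init__(self, n_agents, obs_dim, act_dim, hidden=64):
        super().__init__()
        in_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, all_obs, all_acts):
        # all_obs: (batch, n_agents, obs_dim); all_acts: (batch, n_agents, act_dim)
        x = torch.cat([all_obs.flatten(1), all_acts.flatten(1)], dim=-1)
        return self.net(x)

# Decentralized actor: maps one agent's local observation to its own action.
actor = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2), nn.Tanh())
critic = CentralizedCritic(n_agents=3, obs_dim=8, act_dim=2)

obs = torch.randn(5, 3, 8)                                       # batch of 5, 3 agents
acts = torch.stack([actor(obs[:, i]) for i in range(3)], dim=1)  # each agent acts locally
print(critic(obs, acts).shape)                                   # torch.Size([5, 1])
```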

Long Short-Term Memory

A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
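
A compact NumPy sketch of a single LSTM step (this is the later variant with a forget gate rather than the original 1997 formulation): the cell state is updated additively through gates, which is what lets error flow across long time lags.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (4*H, X+H) stacked gate weights; b: (4*H,) bias.
    Gate order here (an arbitrary convention): input, forget, output, candidate."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H]), np.tanh(z[3*H:])
    c = f * c_prev + i * g        # additive cell update: the "constant error carousel"
    h = o * np.tanh(c)
    return h, c

X, H = 3, 4
rng = np.random.default_rng(0)
W, b = 0.1 * rng.normal(size=(4 * H, X + H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for _ in range(5):
    h, c = lstm_step(rng.normal(size=X), h, c, W, b)
print(h.round(3))
```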

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

Empirical results demonstrate that influence leads to enhanced coordination and communication in challenging social dilemma environments, markedly improving the learning curves of the deep RL agents and leading to more meaningful learned communication protocols.
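
A hedged sketch of the influence reward in a simplified discrete setting: agent A receives intrinsic reward equal to the divergence between agent B's policy given A's actual action and B's counterfactual marginal policy averaged over A's alternatives. `policy_b` is a hypothetical function returning B's action distribution as a function of A's action.

```python
import numpy as np

def influence_reward(policy_b, actions_a, prior_a, taken_a):
    """Counterfactual influence of agent A's action on agent B's policy.

    policy_b(a): B's action distribution given A took action a (hypothetical).
    actions_a: A's possible actions; prior_a: A's own policy over them.
    taken_a: the action A actually took.
    """
    p_cond = policy_b(taken_a)
    # Counterfactual marginal: average B's policy over A's possible actions.
    p_marg = sum(prior_a[a] * policy_b(a) for a in actions_a)
    return float(np.sum(p_cond * np.log(p_cond / p_marg)))  # KL(p_cond || p_marg)

# Toy example: B copies A's binary action with probability 0.9, so A has influence.
policy_b = lambda a: np.array([0.9, 0.1]) if a == 0 else np.array([0.1, 0.9])
print(influence_reward(policy_b, actions_a=[0, 1], prior_a={0: 0.5, 1: 0.5}, taken_a=0))
```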

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion, is evaluated on a challenging set of SMAC scenarios, where it significantly outperforms existing multi-agent reinforcement learning methods.

Towards Interpretable Reinforcement Learning Using Attention Augmented Agents

This model uses a soft, top-down attention mechanism to create a bottleneck in the agent, forcing it to focus on task-relevant information by sequentially querying its view of the environment.
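
A small sketch of such a top-down attention bottleneck (PyTorch; the query source and dimensions are assumptions): a query derived from the agent's recurrent state scores every spatial location of the visual feature map, and the agent only receives the attention-weighted summary, making the attention map itself inspectable.

```python
import torch
import torch.nn as nn

class TopDownAttention(nn.Module):
    """Query the visual feature map with a vector derived from the agent's state."""
    def __init__(self, feat_dim, state_dim, key_dim=32):
        super().__init__()
        self.to_query = nn.Linear(state_dim, key_dim)
        self.to_key = nn.Linear(feat_dim, key_dim)

    def forward(self, features, state):
        # features: (batch, H*W, feat_dim); state: (batch, state_dim), e.g. an LSTM state
        q = self.to_query(state).unsqueeze(1)                 # (batch, 1, key_dim)
        k = self.to_key(features)                             # (batch, H*W, key_dim)
        scores = (q * k).sum(-1) / k.shape[-1] ** 0.5         # (batch, H*W)
        weights = torch.softmax(scores, dim=-1)               # the inspectable attention map
        summary = (weights.unsqueeze(-1) * features).sum(1)   # (batch, feat_dim) bottleneck
        return summary, weights

attn = TopDownAttention(feat_dim=16, state_dim=8)
feats = torch.randn(2, 25, 16)   # a 5x5 spatial grid of visual features
state = torch.randn(2, 8)
summary, weights = attn(feats, state)
print(summary.shape, weights.shape)  # torch.Size([2, 16]) torch.Size([2, 25])
```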

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
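
A hedged sketch of the monotonic mixing idea (PyTorch; layer sizes are illustrative): a hypernetwork conditioned on the global state produces non-negative weights, here via an absolute value, that combine the per-agent values, which guarantees the joint action-value is monotonic in each per-agent value.

```python
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    """Mix per-agent Q-values into Q_tot using state-conditioned non-negative weights."""
    def __init__(self, n_agents, state_dim, embed=32):
        super().__init__()
        self.n_agents, self.embed = n_agents, embed
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed)
        self.hyper_b1 = nn.Linear(state_dim, embed)
        self.hyper_w2 = nn.Linear(state_dim, embed)
        self.hyper_b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed)
        b1 = self.hyper_b1(state).unsqueeze(1)
        hidden = torch.relu(agent_qs.unsqueeze(1) @ w1 + b1)   # (batch, 1, embed)
        w2 = torch.abs(self.hyper_w2(state)).unsqueeze(-1)     # (batch, embed, 1)
        q_tot = hidden @ w2 + self.hyper_b2(state).unsqueeze(1)
        return q_tot.squeeze(-1).squeeze(-1)                   # (batch,)

mixer = MonotonicMixer(n_agents=3, state_dim=10)
print(mixer(torch.randn(4, 3), torch.randn(4, 10)).shape)  # torch.Size([4])
```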

Neuroevolution of Self-Interpretable Agents

arXiv preprint arXiv:2003.08165, 2020

Learning Social Learning

The results indicate that social learning can enable RL agents to not only improve performance on the task at hand, but improve generalization to novel environments.

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

It is demonstrated that Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform just as well as or better than state-of-the-art joint learning approaches on the popular multi-agent benchmark suite SMAC with little hyperparameter tuning.
...