A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

@article{Jain2020ACS,
  title={A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks},
  author={Unnat Jain and Luca Weihs and Eric Kolve and Ali Farhadi and Svetlana Lazebnik and Aniruddha Kembhavi and Alexander G. Schwing},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.04979}
}
Autonomous agents must learn to collaborate. It is not scalable to develop a new centralized agent every time a task's difficulty outpaces a single agent's abilities. While multi-agent collaboration research has flourished in gridworld-like environments, relatively little work has considered visually rich domains. Addressing this, we introduce the novel task FurnMove in which agents work together to move a piece of furniture through a living room to a goal. Unlike existing tasks, FurnMove… 

E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance

E-MAPP is introduced, a novel framework that leverages parallel programs to guide multiple agents to efficiently accomplish goals that require planning over 10+ stages, and outperforms strong baselines in terms of completion rate, time efficiency, and zero-shot generalization ability.

Embodied Multi-Agent Task Planning from Ambiguous Instruction

An embodied multi-agent task planning framework is proposed that uses external knowledge sources and dynamically perceived visual information to resolve ambiguous high-level instructions, dynamically allocate the decomposed tasks to multiple agents, and generate sub-goals that drive navigation.

Collaborative Visual Navigation

This work proposes a large-scale 3D dataset, CollaVN, for multi-agent visual navigation (MAVN), and proposes a memory-augmented communication framework that allows agents to make better use of their past communication information, enabling more efficient collaboration and robust long-term planning.

ASC me to Do Anything: Multi-task Training for Embodied AI

Atomic Skill Completion (ASC) is proposed, an approach for multi-task training for Embodied AI, where a set of atomic skills shared across multiple tasks are composed together to perform the tasks.

ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward

It is shown that agent coordination improves through expectation alignment because agents learn to divide tasks amongst themselves, break coordination symmetries, and confuse adversaries, enabling agents to do zero-shot coordination.

GridToPix: Training Embodied Agents with Minimal Supervision

GRIDTOPIX is proposed to 1) train agents with terminal rewards in gridworlds that generically mirror Embodied AI environments, i.e., they are independent of the task, and 2) distill the learned policy into agents that reside in complex visual worlds.
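
The distillation in step 2 can be read as standard policy distillation: the visual student is trained to match the gridworld teacher's action distribution. A rough sketch of one such loss (generic KL distillation, assumed here for illustration; not the paper's training code):

import numpy as np

def distillation_loss(teacher_probs, student_logits):
    """KL(teacher || student) for a single state, over a shared action set."""
    z = student_logits - student_logits.max()          # stabilised log-softmax
    log_student = z - np.log(np.exp(z).sum())
    log_teacher = np.log(teacher_probs + 1e-12)        # guard against log(0)
    return float(np.sum(teacher_probs * (log_teacher - log_student)))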

AllenAct: A Framework for Embodied AI Research

AllenAct is introduced, a modular and flexible learning framework designed with a focus on the unique requirements of Embodied AI research that provides first-class support for a growing collection of embodied environments, tasks and algorithms.

Semantic Tracklets: An Object-Centric Representation for Visual Multi-Agent Reinforcement Learning

This method is the first to successfully learn a strategy for five players in the GFootball environment using only visual data, and ‘Semantic tracklets’ consistently outperform baselines on VMPE, and achieve a +2.4 higher score difference than baseline on GFootball.

CH-MARL: A Multimodal Benchmark for Cooperative, Heterogeneous Multi-Agent Reinforcement Learning

This work proposes a multimodal (vision-and-language) benchmark for cooperative and heterogeneous multi-agent learning, and implements a message-passing interface between agents to enable information sharing in decentralized model setups where they would otherwise not have the ability to collaborate with each other.

Shaping embodied agent behavior with activity-context priors from egocentric video

This work introduces an approach to discover activity-context priors from in-the-wild egocentric video captured with human-worn cameras, encoding the video-based prior as an auxiliary reward function that encourages an agent to bring compatible objects together before attempting an interaction.

References

Showing 1-10 of 131 references

Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games

This paper introduces a Multiagent Bidirectionally-Coordinated Network (BiCNet) with a vectorised extension of the actor-critic formulation and demonstrates that, without any supervision such as human demonstrations or labelled data, BiCNet can learn various types of advanced coordination strategies commonly used by experienced game players.

TarMAC: Targeted Multi-Agent Communication

This work proposes a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially observable environments, and augments this with a multi-round communication approach.
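
"Learning whom to address" is realised with signature/value messages and attention on the receiver side; roughly, as in this hedged NumPy sketch (assumed names and shapes, not the paper's code):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def tarmac_aggregate(queries, signatures, values):
    """queries: (n, d_k), one per receiving agent; signatures: (n, d_k) and
    values: (n, d_v), one per sender. Returns (n, d_v) aggregated messages:
    receiver i attends most to senders whose signatures match its query."""
    attn = softmax(queries @ signatures.T / np.sqrt(queries.shape[1]))
    return attn @ values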

Counterfactual Multi-Agent Policy Gradients

A new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients is proposed, which uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.
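
The centralised critic enables a counterfactual baseline: an agent's advantage compares the joint action actually taken against marginalising out that agent's own action under its policy. A small sketch of that computation (my paraphrase of the construction, with hypothetical argument names):

import numpy as np

def counterfactual_advantage(q_alternatives, policy_a, action_a):
    """q_alternatives: (n_actions,) critic values Q(s, (u_-a, u'_a)) with the
    other agents' actions held fixed and agent a's action swapped out;
    policy_a: (n_actions,) agent a's action distribution; action_a: index taken."""
    baseline = float(np.dot(policy_a, q_alternatives))  # counterfactual baseline
    return q_alternatives[action_a] - baseline          # advantage for the policy gradient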

GridToPix: Training Embodied Agents with Minimal Supervision

GRIDTOPIX is proposed to 1) train agents with terminal rewards in gridworlds that generically mirror Embodied AI environments, i.e., they are independent of the task, and 2) distill the learned policy into agents that reside in complex visual worlds.

AllenAct: A Framework for Embodied AI Research

AllenAct is introduced, a modular and flexible learning framework designed with a focus on the unique requirements of Embodied AI research that provides first-class support for a growing collection of embodied environments, tasks and algorithms.

Cooperative Multi-Agent Learning: The State of the Art

This survey attempts to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics, and finds that this broad view leads to a division of the work into two categories.

Neural Modular Control for Embodied Question Answering

This work uses imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning, to learn navigation policies over long planning horizons from language input.

Two Body Problem: Collaborative Visual Task Completion

This paper studies the problem of learning to collaborate directly from pixels in AI2-THOR and demonstrates the benefits of explicit and implicit modes of communication to perform visual tasks.

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
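
The monotonicity constraint can be made concrete: mixing weights generated by state-conditioned hypernetworks are forced non-negative, so increasing any agent's Q-value can only increase the joint value. A minimal NumPy sketch of that idea (illustrative shapes and parameter names such as hyper_w1; not the authors' implementation):

import numpy as np

def elu(x):
    # ELU nonlinearity between mixing layers.
    return np.where(x > 0, x, np.exp(x) - 1)

def qmix_mixer(agent_qs, state, params):
    """Combine per-agent Q-values into a scalar Q_tot, monotonically.
    agent_qs: (n_agents,) per-agent values; state: (d_s,) global state;
    params: dict of hypernetwork weight matrices (hypothetical names)."""
    n = len(agent_qs)
    # Hypernetworks condition the mixing weights on the global state;
    # np.abs enforces non-negativity, hence dQ_tot/dQ_a >= 0.
    w1 = np.abs(params["hyper_w1"] @ state).reshape(n, -1)
    b1 = params["hyper_b1"] @ state
    w2 = np.abs(params["hyper_w2"] @ state)
    b2 = params["hyper_b2"] @ state
    hidden = elu(agent_qs @ w1 + b1)
    return hidden @ w2 + b2  # scalar Q_tot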

Learning Multiagent Communication with Backpropagation

A simple neural model is explored, called CommNet, that uses continuous communication for fully cooperative tasks and the ability of the agents to learn to communicate amongst themselves is demonstrated, yielding improved performance over non-communicative agents and baselines.
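
CommNet's "continuous communication" amounts to each agent receiving the mean of the other agents' hidden states as a differentiable message. A toy sketch (illustrative shapes and weight names, not the released model):

import numpy as np

def commnet_step(h, w_h, w_c):
    """One communication step. h: (n_agents, d) hidden states (n_agents > 1);
    w_h, w_c: (d, d) weight matrices (hypothetical names)."""
    n = h.shape[0]
    # Message to agent i is the average of all other agents' hidden states.
    c = (h.sum(axis=0, keepdims=True) - h) / (n - 1)
    return np.tanh(h @ w_h + c @ w_c)  # updated hidden states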
...