Multi-Agent Generative Adversarial Imitation Learning
@inproceedings{Song2018MultiAgentGA,
  title     = {Multi-Agent Generative Adversarial Imitation Learning},
  author    = {Jiaming Song and Hongyu Ren and Dorsa Sadigh and Stefano Ermon},
  booktitle = {NeurIPS},
  year      = {2018}
}
Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal. However, most existing approaches are not applicable in multi-agent settings due to the existence of multiple (Nash) equilibria and non-stationary environments. We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We further introduce a practical multi-agent actor…
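The adversarial setup the abstract describes — a discriminator per agent that distinguishes expert state-action pairs from the learner's, with the discriminator's output recycled as a reward signal for the policy — can be illustrated with a minimal sketch. Everything below (the logistic discriminator, the learning rate, the surrogate reward) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

class PerAgentDiscriminator:
    """Logistic discriminator D_i(s, a) for one agent: outputs the
    probability that a state-action feature vector came from the expert."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def prob_expert(self, x):
        return 1.0 / (1.0 + np.exp(-(x @ self.w + self.b)))

    def update(self, expert_x, policy_x):
        # Ascend the adversarial objective:
        # E_expert[log D(s, a)] + E_policy[log(1 - D(s, a))]
        for x, label in [(expert_x, 1.0), (policy_x, 0.0)]:
            p = self.prob_expert(x)
            grad = label - p  # gradient of the log-likelihood wrt the logit
            self.w += self.lr * (grad[:, None] * x).mean(axis=0)
            self.b += self.lr * grad.mean()

    def reward(self, x):
        # Surrogate reward handed to the policy: -log(1 - D(s, a))
        return -np.log(1.0 - self.prob_expert(x) + 1e-8)
```

In the multi-agent setting, one such discriminator would be kept per agent, each supplying rewards to that agent's policy while the joint policies are trained with a multi-agent actor-critic method.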
97 Citations
Multi-Agent Adversarial Inverse Reinforcement Learning
- Computer Science, ICML
- 2019
MA-AIRL is proposed, a new framework for multi-agent inverse reinforcement learning, which is effective and scalable for Markov games with high-dimensional state-action space and unknown dynamics, and significantly outperforms prior methods in terms of policy imitation.
Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems
- Computer Science, AAMAS
- 2019
This work is the first to combine self-imitation learning with generative adversarial imitation learning (GAIL) and apply it to cooperative multiagent systems and produces state-of-the-art results and even outperforms JALs in terms of both convergence speed and final performance.
Imitation Learning From Inconcurrent Multi-Agent Interactions
- Computer Science, 2021 60th IEEE Conference on Decision and Control (CDC)
- 2021
The experiment results demonstrate that compared to state-of-the-art baselines, the iMA-IL model can better infer the policy of each expert agent using their demonstration data collected from inconcurrent decision-making scenarios.
Conditional Imitation Learning for Multi-Agent Games
- Computer Science, ArXiv
- 2022
A model that learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace, and proposes a novel approach to address the difficulties of scalability and data scarcity.
2.3 Inverse Reinforcement Learning and Imitation Learning
- Computer Science
- 2019
Two methods that apply forms of imitation learning to the problem of learning coordinated behaviors are shown to have a close connection to multi-agent actor-critic models, and to avoid relative overgeneralization when the right demonstrations are given.
Sample-efficient Adversarial Imitation Learning from Observation
- Computer Science, ArXiv
- 2019
An algorithm is proposed that addresses the sample-inefficiency problem by utilizing ideas from trajectory-centric reinforcement learning algorithms, demonstrating improvements in learning rate and efficiency.
Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic
- Computer Science, ArXiv
- 2020
A multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works, improving sample efficiency over state-of-the-art baselines across both small- and large-scale tasks.
Multi-Agent Imitation Learning with Copulas
- Computer Science, ECML/PKDD
- 2021
The proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
Simulating Emergent Properties of Human Driving Behavior Using Multi-Agent Reward Augmented Imitation Learning
- Computer Science, 2019 International Conference on Robotics and Automation (ICRA)
- 2019
It is proved that convergence guarantees for the imitation learning process are preserved under the application of reward augmentation, and improved performance is demonstrated in comparison to traditional imitation learning algorithms both in terms of the local actions of a single agent and the behavior of emergent properties in complex, multi-agent settings.
Multi-Agent Imitation Learning for Driving Simulation
- Computer Science, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2018
Compared with single-agent GAIL policies, policies generated by the PS-GAIL method prove superior at interacting stably in a multi-agent setting and capturing the emergent behavior of human drivers.
References
Showing 1-10 of 77 references
Generative Adversarial Imitation Learning
- Computer Science, NIPS
- 2016
A new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning, is proposed and a certain instantiation of this framework draws an analogy between imitation learning and generative adversarial networks.
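The analogy to generative adversarial networks mentioned above is made precise by GAIL's saddle-point objective, where the learner's policy \(\pi\) plays the generator against a discriminator \(D\), \(\pi_E\) is the expert policy, and \(H(\pi)\) is a causal-entropy regularizer with weight \(\lambda\):

```latex
\min_{\pi} \max_{D} \;
\mathbb{E}_{\pi}\!\left[\log D(s, a)\right]
+ \mathbb{E}_{\pi_E}\!\left[\log\bigl(1 - D(s, a)\bigr)\right]
- \lambda H(\pi)
```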
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
- Computer Science, NIPS
- 2017
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Coordinated Multi-Agent Imitation Learning
- Computer Science, ICML
- 2017
It is shown that having a coordination model to infer the roles of players yields substantially improved imitation loss compared to conventional baselines, and the method integrates unsupervised structure learning with conventional imitation learning.
Inverse Reinforcement Learning in Swarm Systems
- Computer Science, Mathematics, AAMAS
- 2017
This paper introduces the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization, and proposes a novel heterogeneous learning scheme that is particularly tailored to the swarm setting.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
- Computer Science, NIPS
- 2016
By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.
Continuous control with deep reinforcement learning
- Computer Science, ICLR
- 2016
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Third-Person Imitation Learning
- Computer Science, Education, ICLR
- 2017
The method's primary insight is that recent advances in domain confusion can be utilized to yield domain-agnostic features, which are crucial during the training process.
Efficient Reductions for Imitation Learning
- Computer Science, AISTATS
- 2010
This work proposes two alternative algorithms for imitation learning where training occurs over several episodes of interaction and shows that this leads to stronger performance guarantees and improved performance on two challenging problems: training a learner to play a 3D racing game and Mario Bros.
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
- Computer Science, AISTATS
- 2011
This paper proposes a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no regret algorithm in an online learning setting and demonstrates that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
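The iterative scheme summarized above (DAgger) can be sketched in a few lines: roll out the current learner, have the expert label the states the learner actually visits, aggregate those labels with all earlier ones, and retrain. The helper signatures `env_rollout` and `train_classifier` are hypothetical, introduced only for illustration:

```python
def dagger(expert_policy, train_classifier, env_rollout, n_iters=5):
    """Minimal DAgger loop. expert_policy(s) returns the expert's action,
    env_rollout(policy) returns the states a policy visits, and
    train_classifier(states, actions) fits a new policy on the aggregate."""
    states, actions = [], []
    learner = expert_policy  # iteration 0: behave like the expert
    for _ in range(n_iters):
        visited = env_rollout(learner)                      # learner's own state distribution
        states.extend(visited)
        actions.extend(expert_policy(s) for s in visited)   # expert labels those states
        learner = train_classifier(states, actions)         # retrain on the aggregated dataset
    return learner
```

Training on the learner's own state distribution, rather than only on expert trajectories, is what yields the no-regret guarantee the summary refers to.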
Maximum Entropy Inverse Reinforcement Learning
- Computer Science, AAAI
- 2008
A probabilistic approach based on the principle of maximum entropy that provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods is developed.
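The "globally normalized distribution over decision sequences" referred to above is, in that paper's notation, an exponential-family distribution over trajectories \(\zeta\) with feature counts \(f_{\zeta}\) and reward weights \(\theta\):

```latex
P(\zeta \mid \theta) \;=\; \frac{1}{Z(\theta)}\, e^{\theta^{\top} f_{\zeta}},
\qquad
Z(\theta) \;=\; \sum_{\zeta} e^{\theta^{\top} f_{\zeta}}
```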