Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
TLDR
An adaptation of actor-critic methods is presented that considers the action policies of other agents and successfully learns policies requiring complex multi-agent coordination.
Learning with Opponent-Learning Awareness
TLDR
Results show that when two LOLA agents meet, tit-for-tat and therefore cooperation emerge in the iterated prisoners' dilemma, whereas independent learning does not produce this outcome. LOLA also receives higher payouts than a naive learner and is robust against exploitation by higher-order gradient-based methods.
Decision Transformer: Reinforcement Learning via Sequence Modeling
TLDR
Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
Emergence of Grounded Compositional Language in Multi-Agent Populations
TLDR
This paper proposes a multi-agent learning environment and learning methods that bring about the emergence of a basic compositional language. The language is represented as streams of abstract discrete symbols uttered by agents over time, yet it has a coherent structure with a defined vocabulary and syntax.
Emergent Tool Use From Multi-Agent Autocurricula
TLDR
This work finds clear evidence of six emergent phases in agent strategy in the authors' environment, each of which creates a new pressure for the opposing team to adapt, and compares hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.
Implicit Generation and Modeling with Energy Based Models
TLDR
This work presents techniques to scale MCMC-based EBM training on continuous neural networks and shows their success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches.
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments
TLDR
A simple gradient-based meta-learning algorithm suitable for adaptation in dynamically changing and adversarial scenarios is developed, and it is demonstrated that meta-learning enables significantly more efficient adaptation than reactive baselines in the few-shot regime.
Implicit Generation and Generalization in Energy-Based Models
TLDR
This work presents techniques to scale MCMC-based EBM training on continuous neural networks and shows their success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches.
Emergent Complexity via Multi-Agent Competition
TLDR
This work introduces several competitive multi-agent environments where agents compete in a 3D world with simulated physics and points out that such environments come with a natural curriculum, because for any skill level, an environment full of agents of this level will have the right level of difficulty.
Discovery of complex behaviors through contact-invariant optimization
TLDR
A motion synthesis framework is presented that produces a wide variety of important human behaviors that have rarely been studied, including getting up from the ground, crawling, climbing, moving heavy objects, acrobatics, and various cooperative actions involving two characters and their manipulation of the environment.