Grandmaster level in StarCraft II using multi-agent reinforcement learning

Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, Richard Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo Danihelka, Aja Huang, L. Sifre, Trevor Cai, John P. Agapiou, Max Jaderberg, Alexander Sasha Vezhnevets, Rémi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom Le Paine, Caglar Gulcehre, Ziyun Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wünsch, Katrina McKinney, Oliver Smith, Tom Schaul, Timothy P. Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps and David Silver
Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions…

SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II

StarCraft Commander (SCC), a deep reinforcement learning agent trained with an order of magnitude less computation, is proposed; it demonstrates top human performance, defeating GrandMaster players in test matches and top professional players in a live event.

Mastering the Game of 3v3 Snakes with Rule-Enhanced Multi-Agent Reinforcement Learning

This work proposes a rule-enhanced multi-agent reinforcement learning algorithm and builds a 3v3 Snakes AI, which achieves state-of-the-art performance and beats human players.

Applying supervised and reinforcement learning methods to create neural-network-based agents for playing StarCraft II

A neural network architecture for playing the full two-player match of StarCraft II, trained with general-purpose supervised and reinforcement learning, that can be trained on a single consumer-grade PC with one GPU and achieves non-trivial performance compared to the in-game scripted bots.

Multi-Agent Collaboration via Reward Attribution Decomposition

CollaQ, a collaborative Q-learning method, is proposed; it achieves state-of-the-art performance in the StarCraft Multi-Agent Challenge, supports ad hoc team play, and outperforms the previous SoTA by over 30%.
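
The reward-attribution decomposition named in the title above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: CollaQ uses attention-based networks, whereas the sketch below uses linear function approximators with made-up weight names to show only the additive structure.

```python
import numpy as np

def decomposed_q(w_alone, w_collab, obs, others_obs):
    """CollaQ-style reward-attribution decomposition, sketched with
    linear function approximators:

        Q_i(o_i, o_-i) = Q_alone(o_i) + Q_collab(o_i, o_-i)

    Q_collab is trained to vanish when no teammates are observed, so
    Q_alone captures the agent acting on its own and Q_collab attributes
    the remaining value to collaboration with the observed teammates."""
    q_alone = w_alone @ obs                                  # (n_actions,)
    q_collab = w_collab @ np.concatenate([obs, others_obs])  # (n_actions,)
    return q_alone + q_collab
```

With all-zero teammate observations and a collaboration term that has learned to ignore them, the agent falls back to its stand-alone Q-values, which is what makes ad hoc team play possible.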

Missouri State University

  • A. Harris
  • Computer Science
  • 2019
This thesis designs a robust DenseNet-style actor-critic deep neural network that controls multiple agents by combining local observations with abstracted global information to compete against opponent agents, promoting a thorough understanding of a unit's potential influence without requiring a complete view of the global state.

MAIDRL: Semi-centralized Multi-Agent Reinforcement Learning using Agent Influence

A novel semi-centralized deep reinforcement learning algorithm for mixed cooperative and competitive multi-agent environments is proposed, using a robust DenseNet-style actor-critic deep neural network that controls multiple agents by combining local observations with abstracted global information to compete against opponent agents.

TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

TiKick is the first learning-based AI system that can take over the full multi-agent Google Research Football game, whereas previous work could either control only a single agent or experiment on toy academic scenarios.

Large Scale Deep Reinforcement Learning in War-games

A hierarchical multi-agent reinforcement learning framework for rapidly training an AI model for hexagon-grid war-games is proposed; results show that the hierarchical structure allows agents to learn their strategies effectively.

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
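
For reference, the Markov (stochastic) game framework that organizes this taxonomy is standardly written as the tuple

```latex
\mathcal{G} \;=\; \big\langle \mathcal{N},\, \mathcal{S},\, \{\mathcal{A}^i\}_{i \in \mathcal{N}},\, P,\, \{R^i\}_{i \in \mathcal{N}},\, \gamma \big\rangle,
\qquad
P : \mathcal{S} \times \mathcal{A}^1 \times \cdots \times \mathcal{A}^N \to \Delta(\mathcal{S}),
```

where N is the set of agents, S the state space, A^i agent i's action space, R^i its reward function, and γ the discount factor. The task types in the summary correspond to R^1 = ⋯ = R^N (fully cooperative) and, for two players, R^1 = −R^2 (fully competitive); the mixed setting places no constraint on the R^i.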

An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective

This work provides a self-contained assessment of current state-of-the-art MARL techniques from a game-theoretic perspective. It is intended to serve as a stepping stone both for new researchers entering this fast-growing domain and for existing domain experts who want a panoramic view and new directions based on recent advances.

StarCraft II: A New Challenge for Reinforcement Learning

This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game. It offers a new and challenging setting for exploring deep reinforcement learning algorithms and architectures, and gives initial baseline results for neural networks trained from this data to predict game outcomes and player actions.

StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning

An efficient state representation is defined, which breaks down the complexity caused by the large state space of the game environment, and a parameter-sharing multi-agent gradient-descent Sarsa(λ) algorithm is proposed to train the units.
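
The parameter-sharing idea can be sketched in a few lines: every unit's transitions update one shared value function. This is a minimal tabular stand-in, assuming a generic (state, action) key; the paper's state abstraction, reward design, and gradient-descent function approximation are not reproduced here.

```python
import random
from collections import defaultdict

class SharedSarsaLambda:
    """Tabular Sarsa(lambda) with one Q-table shared by all units.

    Every unit feeds its (s, a, r, s', a') transitions into the same
    table, so experience from any unit improves the common policy."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, lam=0.8, eps=0.1):
        self.q = defaultdict(float)      # shared across all units
        self.e = defaultdict(float)      # eligibility traces
        self.actions = actions
        self.alpha, self.gamma, self.lam, self.eps = alpha, gamma, lam, eps

    def act(self, state):
        # epsilon-greedy over the shared Q-table
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2, a2):
        # Sarsa TD error, then decay all eligibility traces
        delta = r + self.gamma * self.q[(s2, a2)] - self.q[(s, a)]
        self.e[(s, a)] += 1.0            # accumulating trace
        for key in list(self.e):
            self.q[key] += self.alpha * delta * self.e[key]
            self.e[key] *= self.gamma * self.lam
```

In a multi-unit loop, each unit would call `act` with its own abstracted local state and feed its transition into the same `update`, which is the whole of the parameter-sharing trick.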

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

An algorithm is described, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, together with empirical game-theoretic analysis to compute meta-strategies for policy selection; it generalizes previous algorithms such as independent RL (InRL).
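
The meta-strategy step can be sketched with the simplest meta-solver: fictitious play on an empirical payoff matrix estimated from head-to-head matches. The function name and the two-player zero-sum restriction are assumptions for this sketch; the paper's framework also covers other meta-solvers such as projected replicator dynamics.

```python
import numpy as np

def fictitious_play_meta(payoff, iters=2000):
    """Approximate a Nash meta-strategy for a two-player zero-sum
    empirical game via fictitious play.

    payoff[i, j]: row player's expected payoff when row policy i
    meets column policy j (estimated from match results).
    Returns mixture weights over the row player's policies."""
    n, m = payoff.shape
    row_counts = np.zeros(n)
    col_counts = np.zeros(m)
    row_counts[0] = col_counts[0] = 1.0
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        row_counts[np.argmax(payoff @ (col_counts / col_counts.sum()))] += 1
        col_counts[np.argmin((row_counts / row_counts.sum()) @ payoff)] += 1
    return row_counts / row_counts.sum()
```

On a rock-paper-scissors payoff matrix this recovers a near-uniform mixture; in the full algorithm, the resulting meta-strategy decides which policies the next deep-RL best response trains against.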

Human-level performance in 3D multiplayer games with population-based reinforcement learning

A tournament-style evaluation is used to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input.

Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

A heuristic reinforcement learning algorithm is proposed which combines direct exploration in policy space with backpropagation, allowing traces collected with deterministic policies to be used for learning; this appears much more efficient than, for example, ε-greedy exploration.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning

This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

A new distributed agent IMPALA (Importance Weighted Actor-Learner Architecture) is developed that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.
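
The off-policy correction that lets IMPALA decouple acting from learning is V-trace. Below is a minimal single-trajectory sketch in plain Python; the function name and list-based interface are assumptions, but the recursion follows the V-trace definition with truncated importance weights (the constants rho_bar and c_bar in the paper's notation).

```python
def v_trace_targets(rewards, values, bootstrap, rhos, gamma=0.99,
                    rho_bar=1.0, c_bar=1.0):
    """Compute V-trace targets v_t for one trajectory of length T.

    rewards, values: lists of length T; values[t] is V(x_t) under the learner.
    bootstrap: V(x_T), the value estimate after the last step.
    rhos: importance ratios pi(a_t|x_t) / mu(a_t|x_t) w.r.t. the actor policy mu.
    """
    T = len(rewards)
    targets = [0.0] * T
    next_v = bootstrap                     # v_{t+1}, initialised to the bootstrap
    for t in reversed(range(T)):
        next_value = values[t + 1] if t + 1 < T else bootstrap
        rho = min(rho_bar, rhos[t])        # truncated importance weight
        c = min(c_bar, rhos[t])            # truncated trace coefficient
        delta = rho * (rewards[t] + gamma * next_value - values[t])
        # Recursion: v_t = V(x_t) + delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1}))
        targets[t] = values[t] + delta + gamma * c * (next_v - next_value)
        next_v = targets[t]
    return targets
```

When the actor and learner policies coincide (all ratios equal to 1), the targets reduce to ordinary n-step returns, which is the sanity check the truncation constants are designed to preserve.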

An Analysis of Model-Based Heuristic Search Techniques for StarCraft Combat Scenarios

This paper presents the first integration of PGS (Portfolio Greedy Search) into the StarCraft game engine, compares its performance to the current state-of-the-art deep reinforcement learning method in several benchmark combat scenarios, and explores possible issues related to its reliance on an abstract simulator as a forward model.

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

This paper generalizes the AlphaZero approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games, and convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.