Corpus ID: 222208790

Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games

@article{Huang2020ActionGG,
  title={Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games},
  author={Shengyi Huang and Santiago Ontañón},
  journal={ArXiv},
  year={2020},
  volume={abs/2010.03956}
}
Training agents using Reinforcement Learning in games with sparse rewards is a challenging problem, since large amounts of exploration are required to retrieve even the first reward. To tackle this problem, a common approach is to use reward shaping to help exploration. However, an important drawback of reward shaping is that agents sometimes learn to optimize the shaped reward instead of the true objective. In this paper, we present a novel technique that we call action guidance that… 
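A minimal sketch, in Python, of what the action-guidance idea described above could look like in practice, assuming a setup in which an auxiliary policy trained on shaped rewards proposes actions with a decaying probability while the main policy is always optimized against the sparse (true) reward; the object interfaces and the decay schedule below are illustrative assumptions, not the paper's exact implementation.

    import random

    def action_guidance_step(main_policy, aux_policy, obs, epsilon):
        # With probability epsilon, act from the shaped-reward auxiliary policy;
        # otherwise act from the main policy. Either way, the main policy is
        # trained only on the sparse reward, so it optimizes the true objective.
        # (sample_action is an assumed interface, not from the paper's code.)
        if random.random() < epsilon:
            return aux_policy.sample_action(obs)
        return main_policy.sample_action(obs)

    def decayed_epsilon(step, total_steps, eps_start=0.95, eps_end=0.0):
        # Illustrative linear decay so that guidance fades out over training.
        frac = min(step / total_steps, 1.0)
        return eps_start + frac * (eps_end - eps_start)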
5 Citations


Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents
TLDR
This article proposes four different policy fusion methods for combining pre-trained policies and demonstrates how these methods can be used in combination with Inverse Reinforcement Learning in order to create intelligent agents with specific behavioral styles as chosen by game designers, without having to define many and possibly poorly-designed reward functions.
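As a rough illustration of what combining pre-trained policies can mean, the sketch below blends the action distributions of several policies into a weighted mixture; this is only one plausible fusion scheme, not necessarily any of the four methods proposed in the article.

    import numpy as np

    def fuse_policies_mixture(action_probs_list, weights):
        # Weighted mixture of the action distributions of pre-trained policies.
        # Changing the weights changes the behavioral style of the fused agent.
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
        fused = sum(wi * np.asarray(p, dtype=float) for wi, p in zip(w, action_probs_list))
        return fused / fused.sum()

    # Example: blend an "aggressive" and a "defensive" pre-trained policy 70/30.
    fused = fuse_policies_mixture([[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]], weights=[0.7, 0.3])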
MAIDRL: Semi-centralized Multi-Agent Reinforcement Learning using Agent Influence
TLDR
A novel semi-centralized deep reinforcement learning algorithm for mixed cooperative and competitive multi-agent environments, using a robust DenseNet-style actor-critic network that controls multiple agents from a combination of local observations and abstracted global information in order to compete with opponent agents.
MARL-Based Dual Reward Model on Segmented Actions for Multiple Mobile Robots in Automated Warehouse Environment
The simple and labor-intensive tasks of workers on the job site are rapidly becoming digital. In the work environment of logistics warehouses and manufacturing plants, moving goods to a designated…
Transfer Dynamics in Emergent Evolutionary Curricula
TLDR
The main question addressed is how the open-ended learning actually works, focusing in particular on the role of transfer of policies from one evolutionary branch (“species”) to another, and the most insightful finding is that inter-species transfer is crucial to the system’s success.

References

SHOWING 1-10 OF 37 REFERENCES
Learning by Playing - Solving Sparse Reward Tasks from Scratch
TLDR
The key idea behind the method is that active (learned) scheduling and execution of auxiliary policies allow the agent to efficiently explore its environment, enabling it to excel at sparse-reward RL.
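A small sketch of that scheduling idea, assuming the scheduler keeps a value estimate per auxiliary task and samples the next task from a softmax over those values; the interface and temperature parameter are illustrative, not the paper's exact scheduler.

    import numpy as np

    def pick_auxiliary_task(task_values, temperature=1.0):
        # Sample the next auxiliary policy to execute from a softmax over
        # estimated task values, so tasks that historically led to main-task
        # reward are scheduled more often than uniformly random switching.
        v = np.asarray(task_values, dtype=float) / temperature
        probs = np.exp(v - v.max())
        probs /= probs.sum()
        return int(np.random.choice(len(task_values), p=probs))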
Action Space Shaping in Deep Reinforcement Learning
TLDR
The results show how domain-specific removal of actions and discretization of continuous actions can be crucial for successful learning.
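For concreteness, the sketch below shows one common form of the discretization mentioned here: mapping a discrete action index onto a fixed grid over a continuous control; the bin layout is an illustrative choice, not the study's exact scheme.

    import numpy as np

    def discretize_continuous_action(index, low, high, n_bins):
        # Map a discrete action index back to a continuous value on an even grid.
        centers = np.linspace(low, high, n_bins)
        return float(centers[index])

    # Example: a steering command in [-1, 1] exposed to the agent as 7 discrete actions.
    steering = discretize_continuous_action(index=5, low=-1.0, high=1.0, n_bins=7)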
Hindsight Experience Replay
TLDR
A novel technique is presented which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering and may be seen as a form of implicit curriculum.
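A compact sketch of the hindsight relabeling idea: transitions are replayed with goals that were actually achieved later in the same episode, so even failed episodes yield informative rewards. The tuple layout and the reward_fn signature are assumptions for illustration.

    import random

    def her_relabel(episode, reward_fn, k=4):
        # episode: list of (obs, action, next_obs, achieved_goal) tuples.
        # For each transition, sample k goals achieved at or after that step,
        # recompute the sparse binary reward, and store the relabeled copy.
        relabeled = []
        for t, (obs, action, next_obs, achieved) in enumerate(episode):
            for _ in range(k):
                new_goal = random.choice(episode[t:])[3]
                reward = reward_fn(achieved, new_goal)  # e.g. 0 if goals match, else -1
                relabeled.append((obs, action, new_goal, reward, next_obs))
        return relabeled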
Mastering Complex Control in MOBA Games with Deep Reinforcement Learning
TLDR
A deep reinforcement learning framework is presented to tackle the problem of complex action control in Multi-player Online Battle Arena (MOBA) 1v1 games; the framework has low coupling and high scalability, which enables efficient exploration at large scale.
Large-Scale Study of Curiosity-Driven Learning
TLDR
This paper performs the first large-scale study of purely curiosity-driven learning, i.e. without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite, and shows surprisingly good performance.
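The sketch below captures the usual form of a purely curiosity-driven signal: the intrinsic reward is the error of a learned forward model at predicting the next observation (or its embedding); forward_model.predict is an assumed interface, not the paper's code.

    import numpy as np

    def curiosity_bonus(forward_model, obs, action, next_obs, scale=1.0):
        # Intrinsic reward = prediction error of a learned dynamics model.
        # With no extrinsic reward, this error alone drives exploration.
        predicted_next = np.asarray(forward_model.predict(obs, action))
        return scale * float(np.mean((np.asarray(next_obs) - predicted_next) ** 2))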
On Reinforcement Learning for Full-length Game of StarCraft
TLDR
A hierarchical approach whose two levels of abstraction can reduce the action space by an order of magnitude yet remain effective, and a curriculum transfer learning approach that trains the agent against opponents of increasing difficulty, from the simplest to harder ones.
StarCraft II: A New Challenge for Reinforcement Learning
TLDR
This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game that offers a new and challenging environment for exploring deep reinforcement learning algorithms and architectures, and gives initial baseline results for neural networks trained on game replay data to predict game outcomes and player actions.
Comparing Observation and Action Representations for Deep Reinforcement Learning in MicroRTS
TLDR
A preliminary study comparing different observation and action space representations for Deep Reinforcement Learning (DRL) in the context of Real-time Strategy (RTS) games shows that the local representation seems to outperform the global representation when training agents on the task of harvesting resources.
Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping
TLDR
Conditions under which modifications to the reward function of a Markov decision process preserve the optimal policy are investigated, shedding light on the practice of reward shaping, a method used in reinforcement learning whereby additional training rewards are used to guide the learning agent.
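The central result is easy to state in code: adding a shaping term of the form F(s, s') = gamma * Phi(s') - Phi(s), for any potential function Phi, leaves the optimal policy unchanged. The sketch below assumes the common convention of zeroing the potential at terminal states.

    def shaped_reward(reward, state, next_state, potential, gamma=0.99, done=False):
        # Potential-based reward shaping: the added term telescopes along any
        # trajectory, so returns change by a policy-independent constant and the
        # optimal policy is preserved.
        phi_next = 0.0 if done else potential(next_state)
        return reward + gamma * phi_next - potential(state)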
Apprenticeship learning via inverse reinforcement learning
TLDR
This work thinks of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to try to recover the unknown reward function.
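A short sketch of the quantity this approach revolves around, the discounted feature expectations mu = E[sum_t gamma^t * phi(s_t)]: the algorithm seeks reward weights w (with R(s) = w . phi(s)) under which the learner's feature expectations approach the expert's; feature_fn is an assumed user-supplied feature map.

    import numpy as np

    def feature_expectations(trajectories, feature_fn, gamma=0.99):
        # Empirical discounted feature expectations over a set of trajectories
        # (each trajectory is a list of states). Matching these to the expert's
        # is the core step of apprenticeship learning via inverse RL.
        mus = []
        for traj in trajectories:
            mu = sum((gamma ** t) * np.asarray(feature_fn(s), dtype=float)
                     for t, s in enumerate(traj))
            mus.append(mu)
        return np.mean(mus, axis=0)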