Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning

@inproceedings{Pleines2020ObstacleTW,
  title={Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning},
  author={Marco Pleines and Jenia Jitsev and Mike Preuss and Frank Zimmer},
  booktitle={2020 IEEE Conference on Games (CoG)},
  year={2020},
  pages={447-454}
}
The Obstacle Tower Challenge is the task of mastering a procedurally generated chain of levels that become progressively harder to complete. Whereas last year's top-performing competition entries relied on human demonstrations or reward shaping to cope with the challenge, we present an approach that performed competitively (placed 7th) while starting completely from scratch, using Deep Reinforcement Learning with a relatively simple feed-forward deep network architecture. We…
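
To make the kind of model the abstract describes concrete, below is a minimal feed-forward actor-critic sketch in PyTorch. The encoder layout, layer sizes, and action-space size are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch (assumed architecture, not the paper's exact one):
# a feed-forward CNN actor-critic of the kind trainable from scratch with PPO.
import torch
import torch.nn as nn

class FeedForwardActorCritic(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        # Nature-DQN-style visual encoder for 84x84 RGB observations (assumption)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
        )
        self.policy_head = nn.Linear(512, num_actions)  # action logits
        self.value_head = nn.Linear(512, 1)             # state-value estimate

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

# Usage: logits feed the policy distribution, values feed the PPO critic loss
model = FeedForwardActorCritic(num_actions=12)  # action count is an assumption
logits, value = model(torch.zeros(1, 3, 84, 84))
```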

Citations

LevDoom: A Benchmark for Generalization on Level Difficulty in Reinforcement Learning

TLDR
The LevDoom benchmark is introduced: a suite of semi-realistic 3D simulation environments in the renowned video game Doom with coherent levels of difficulty, designed to evaluate generalization in vision-based RL.

Learning sparse and meaningful representations through embodiment

References

SHOWING 1-10 OF 51 REFERENCES

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal policy optimization algorithms," arXiv:1707.06347, 2017.
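
For context, this is the clipped surrogate objective introduced in the cited paper, reproduced here in standard notation:

```latex
% PPO's clipped surrogate objective. r_t(\theta) is the probability ratio
% between new and old policies, \hat{A}_t the advantage estimate,
% and \epsilon the clipping range.
L^{CLIP}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\left( r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\!\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t
    \right)
  \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
```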

Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning

TLDR
The environment is outlined, and a set of baseline results produced by current state-of-the-art Deep RL methods as well as human players is provided; these algorithms fail to produce agents capable of performing near human level.

Dota 2 with Large Scale Deep Reinforcement Learning

TLDR
By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.

Leveraging Procedural Generation to Benchmark Reinforcement Learning

TLDR
This work empirically demonstrates that diverse environment distributions are essential to adequately train and evaluate RL agents, motivating the extensive use of procedural content generation; the benchmark is also used to investigate the effects of scaling model size.

Grandmaster level in StarCraft II using multi-agent reinforcement learning

TLDR
The agent, AlphaStar, is evaluated, which uses a multi-agent reinforcement learning algorithm and has reached Grandmaster level, ranking among the top 0.2% of human players for the real-time strategy game StarCraft II.

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference

TLDR
SEED (Scalable, Efficient Deep-RL) is a modern scalable reinforcement learning agent that, with a simple architecture, can train on millions of frames per second and lowers the cost of experiments compared to current methods.

Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning

TLDR
A simple technique to improve the generalization ability of deep RL agents: a randomized (convolutional) neural network randomly perturbs input observations, enabling trained agents to adapt to new domains by learning robust features that are invariant across varied and randomized environments.
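
To make the summarized idea concrete, here is a minimal sketch of such an observation-perturbing random layer; the kernel size, initialization, and usage pattern are assumptions, not the paper's exact setup.

```python
# Hedged sketch of network randomization: a single convolutional layer whose
# weights are re-drawn on every forward pass perturbs the observation before
# it reaches the agent. The layer itself is never trained.
import torch
import torch.nn as nn

class RandomConvPerturbation(nn.Module):
    def __init__(self, channels: int = 3, kernel_size: int = 3):
        super().__init__()
        # Same-size output via padding; kernel size is an assumption
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Re-initialize the kernel so each batch sees a different random
        # perturbation of the input observation.
        nn.init.xavier_normal_(self.conv.weight)
        with torch.no_grad():
            return self.conv(obs)

# Usage: feed perturbed observations to the policy during training
perturb = RandomConvPerturbation()
perturbed = perturb(torch.rand(8, 3, 84, 84))
```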

Generalization of Reinforcement Learners with Working and Episodic Memory

TLDR
This paper develops a comprehensive methodology to test different kinds of memory in an agent and assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions that are relevant for evaluating memory-specific generalization.
...