Corpus ID: 194604

Frame Skip Is a Powerful Parameter for Learning to Play Atari

@inproceedings{Braylan2015FrameSI,
  title={Frame Skip Is a Powerful Parameter for Learning to Play Atari},
  author={Alexander Braylan and Mark Hollenbeck and Elliot Meyerson and Risto Miikkulainen},
  booktitle={AAAI Workshop: Learning for General Competency in Video Games},
  year={2015}
}
We show that setting a reasonable frame skip can be critical to the performance of agents learning to play Atari 2600 games. In all of the six games in our experiments, frame skip is a strong determinant of success. For two of these games, setting a large frame skip leads to state-of-the-art performance. The rate at which an agent interacts with its environment may be critical to its success. In the Arcade Learning Environment (ALE) (Bellemare et al. 2013), games run at sixty frames per second… 
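
The mechanism in question is plain action repetition: the agent chooses an action once every k frames, that action is held for the intervening frames, and the rewards are accumulated. A minimal sketch of such a loop, assuming Gymnasium's ALE binding (the "ALE/Seaquest-v5" id, the frameskip=1 override, and the helper name step_with_skip are illustrative, not taken from the paper):

import gymnasium as gym

FRAME_SKIP = 20  # the paper's point: this single value strongly shapes learning

env = gym.make("ALE/Seaquest-v5", frameskip=1)  # disable built-in skipping
obs, info = env.reset()

def step_with_skip(env, action, skip=FRAME_SKIP):
    """Repeat `action` for `skip` frames, accumulating reward."""
    total_reward = 0.0
    for _ in range(skip):
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return obs, total_reward, terminated, truncated, info

With a larger skip the agent acts on a coarser time scale: decisions are cheaper and the effective task horizon is shorter, at the cost of finer-grained control.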

Citations

Dynamic Frame skip Deep Q Network
Proposes the Dynamic Frame skip Deep Q-Network (DFDQN), which makes the frame skip rate a dynamically learnable parameter, and shows empirically that this improves performance on relatively harder games such as Seaquest.

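A hedged sketch of that idea: the discrete action set is augmented so the network chooses an action and a repetition length jointly. The repeat values {4, 20} and the reuse of step_with_skip from the sketch above are assumptions for illustration:

import itertools

BASE_ACTIONS = list(range(18))  # the full Atari 2600 action set
REPEATS = [4, 20]               # assumed short and long frame-skip choices

# One Q-network output unit per (action, repeat) pair, so the skip rate
# becomes part of what the agent learns rather than a fixed hyper-parameter.
AUGMENTED_ACTIONS = list(itertools.product(BASE_ACTIONS, REPEATS))

def execute(env, q_index):
    action, repeat = AUGMENTED_ACTIONS[q_index]
    return step_with_skip(env, action, skip=repeat)
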
An Analysis of Frame-skipping in Reinforcement Learning
Investigates the role of the "frame-skip" parameter d in RL and defines a task-dependent quantity, the "price of inertia", in terms of which the loss incurred by action repetition is upper-bounded by the gain that a smaller task horizon brings to learning.

Utilizing Skipped Frames in Action Repeats via Pseudo-Actions
The key idea is to make the transitions between action-decision points usable as training data by introducing pseudo-actions; the method can be combined with any model-free reinforcement learning algorithm that learns Q-functions.

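A minimal sketch of the pseudo-action idea, assuming a Gym-style environment and a list-like replay buffer; all names are illustrative:

def collect_with_pseudo_actions(env, obs, action, skip, replay_buffer):
    """Repeat `action` for `skip` frames, storing every intermediate
    transition as a training sample rather than discarding it."""
    for _ in range(skip):
        next_obs, reward, terminated, truncated, _ = env.step(action)
        # Skipped frames become (s, a, r, s', done) samples, so any
        # Q-learning method can also train on the in-between transitions.
        replay_buffer.append((obs, action, reward, next_obs, terminated))
        obs = next_obs
        if terminated or truncated:
            break
    return obs
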
Learning and Generalization in Atari Games
This thesis describes the design of agents that learn to play Atari games through the Arcade Learning Environment (ALE) framework and examines how well an agent can learn from multiple games simultaneously.

TempoRL: Learning When to Act
Proposes a proactive setting in which the agent selects not only an action in a state but also how long to commit to that action, introducing skip connections between states and learning a skip-policy for repeating the same action along these skips.

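A hedged sketch of that two-level decision: one value function chooses the action, a second chooses how long to commit to it. The call signatures and MAX_SKIP are placeholders, not the paper's API:

import numpy as np

MAX_SKIP = 10  # assumed cap on how long the agent may commit to an action

def act(state, q_action, q_skip):
    action = int(np.argmax(q_action(state)))          # what to do
    skip = 1 + int(np.argmax(q_skip(state, action)))  # how long to do it
    return action, min(skip, MAX_SKIP)
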
Dynamic Action Repetition for Deep Reinforcement Learning
Proposes a framework that turns the action repetition rate (the time scale at which an action is repeated) from a hyper-parameter into a dynamically learnable quantity, and shows empirically that this dynamic time-scale mechanism improves performance on relatively harder Atari 2600 games, independent of the underlying deep reinforcement learning algorithm.

Macro Action Ensemble Searching Methodology
Inspired by neural architecture search techniques, the proposed method searches finite macro action ensemble spaces directly, which other contemporary methods have yet to achieve.

The Atari Grand Challenge Dataset
Collects and describes a large dataset of human Atari 2600 replays, the largest and most diverse such dataset publicly released to date, and outlines research directions opened up by this data.

Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Explores Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and policy gradients, and highlights several advantages of ES as a black-box optimization technique.

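The basic estimator behind that approach fits in a few lines; a minimal sketch, assuming a user-supplied evaluate(theta) that runs an episode and returns its total reward:

import numpy as np

def es_step(theta, evaluate, npop=50, sigma=0.1, alpha=0.01):
    """One Evolution Strategies update: perturb the parameters with
    Gaussian noise, score each perturbation, and step toward the
    better-scoring directions."""
    noise = np.random.randn(npop, theta.size)
    returns = np.array([evaluate(theta + sigma * eps) for eps in noise])
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    grad_estimate = noise.T @ advantages / (npop * sigma)
    return theta + alpha * grad_estimate
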
NEAT for large-scale reinforcement learning through evolutionary feature learning and policy gradient search
Proposes a new NEAT-based reinforcement learning scheme whose key technical advancements include a three-stage learning scheme that clearly separates feature learning from policy learning, allowing effective knowledge sharing and learning across multiple agents.

…

References

The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
Illustrates the promise of ALE by developing and benchmarking domain-independent agents using well-established AI techniques for both reinforcement learning and planning, and proposes an evaluation methodology made possible by ALE.

A Neuroevolution Approach to General Atari Game Playing
Results suggest that neuroevolution is a promising approach to general video game playing (GVGP); the evolved agents achieved state-of-the-art results, even surpassing human high scores on three games.

Playing Atari with Deep Reinforcement Learning
Presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning; it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

Incremental Evolution of Complex General Behavior
Proposes an approach in which complex general behavior is learned incrementally, starting with simpler behavior and gradually making the task more challenging and general, yielding more effective and more general behavior.

Co-evolving recurrent neurons learn deep memory POMDPs
Introduces a new neuroevolution algorithm, Hierarchical Enforced SubPopulations, that simultaneously evolves networks at two levels of granularity: full networks and network components (neurons).

Training Recurrent Networks by Evolino
Shows that Evolino-based LSTM can solve tasks that Echo State networks cannot, and achieves higher accuracy on certain continuous function generation tasks than conventional gradient-descent RNNs, including gradient-based LSTM.

Temporal Abstraction in Monte Carlo Tree Search