Corpus ID: 244714745

Final Adaptation Reinforcement Learning for N-Player Games

@article{Konen2021FinalAR,
  title={Final Adaptation Reinforcement Learning for N-Player Games},
  author={W. Konen and Samineh Bagheri},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.14375}
}
This paper covers n-tuple-based reinforcement learning (RL) algorithms for games. We present new algorithms for TD-, SARSA- and Q-learning which work seamlessly on various games with an arbitrary number of players. This is achieved by taking a player-centered view where each player propagates his/her rewards back to previous rounds. We add a new element called Final Adaptation RL (FARL) to all these algorithms. Our main contribution is that FARL is a vitally important ingredient to achieve success… 
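
As a rough illustration (not the authors' actual GBG implementation), the player-centered idea plus FARL can be sketched in Python; game, value and update are hypothetical interfaces assumed only for this sketch:

import random

def td_episode(game, value, update, alpha=0.01, gamma=1.0):
    # One self-play episode of player-centered TD learning with FARL.
    last = [None] * game.n_players            # each player's previous state
    state = game.reset()
    while not game.is_over(state):
        p = game.to_move(state)
        action = random.choice(game.legal_actions(state))  # exploration only
        nxt = game.step(state, action)
        if last[p] is not None:
            # player p propagates value back to the state reached
            # on his/her previous turn
            delta = gamma * value(nxt, p) - value(last[p], p)
            update(last[p], p, alpha * delta)
        last[p] = nxt
        state = nxt
    # Final Adaptation RL (FARL): once the outcome is known, every player
    # adapts the value of his/her last state toward the final reward.
    for p in range(game.n_players):
        if last[p] is not None:
            delta = game.final_reward(state, p) - value(last[p], p)
            update(last[p], p, alpha * delta)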

Citations

Reinforcement Learning for N-player Games: The Importance of Final Adaptation

This paper presents a new algorithm for temporal difference (TD) learning which works seamlessly on various games with an arbitrary number of players, and adds a new element called Final Adaptation RL (FARL) to this algorithm.

General Board Game Playing Framework

An important element of AlphaZero – the Monte Carlo Tree Search (MCTS) planning stage – is picked up and combined with temporal difference (TD) learning agents to create versatile agents that at the same time keep the computational demands low.
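
As a hedged sketch of planning with a learned evaluator: the depth-limited negamax below is a much-simplified stand-in for the MCTS planning stage, cutting off search with a TD value function. The two-player zero-sum setting and the game/value interfaces are assumptions made for illustration:

def negamax(game, value, state, depth):
    # Depth-limited search; the learned TD value replaces deeper search.
    if game.is_over(state):
        return game.final_reward(state, game.to_move(state))
    if depth == 0:
        return value(state, game.to_move(state))
    best = -float('inf')
    for a in game.legal_actions(state):
        # zero-sum convention: the opponent's value is our negation
        best = max(best, -negamax(game, value, game.step(state, a), depth - 1))
    return best

def plan_move(game, value, state, depth=2):
    return max(game.legal_actions(state),
               key=lambda a: -negamax(game, value, game.step(state, a), depth))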

Bioinspired Optimization Methods and Their Applications: 9th International Conference, BIOMA 2020, Brussels, Belgium, November 19–20, 2020, Proceedings

The real-world problems studied were easier to solve than the synthetic ones, and the analysis reveals why: they have easier-to-traverse global structures with fewer nodes and edges, no sub-optimal funnels, higher neutrality, and multiple global optima with shorter trajectories towards them.

References


Online Adaptable Learning Rates for the Game Connect-4

This work investigates whether online-adaptable learning rates such as Incremental Delta Bar Delta (IDBD) or temporal coherence learning (TCL) have the potential to speed up learning for a task as complex as Connect-4, and proposes a new variant of TCL with geometric step size changes.
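
A minimal sketch of the TCL idea, assuming linear function approximation: each weight keeps an accumulated signed update N and absolute update A, and the coherence ratio |N|/A scales its individual step size. The exponential transfer below models the geometric variant; all names and constants are illustrative assumptions:

import numpy as np

def tcl_update(w, N, A, grad, delta, alpha0=0.1, beta=2.7):
    # coherence ratio per weight, in [0, 1]; 1.0 where no history yet
    # (w, N, A, grad are float arrays of the same shape)
    r = np.divide(np.abs(N), A, out=np.ones_like(A, dtype=float), where=A > 0)
    lr = alpha0 * np.exp(beta * (r - 1.0))   # geometric (exponential) transfer
    w = w + lr * delta * grad                # individually scaled TD update
    N = N + delta * grad                     # signed update history
    A = A + np.abs(delta * grad)             # absolute update history
    return w, N, A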

Reinforcement Learning for Board Games: The Temporal Difference Algorithm

This technical report shows how the ideas of reinforcement learning (RL) and temporal difference (TD) learning can be applied to board games. It collects the main ideas from Sutton and Barto's textbook.

Temporal difference learning with eligibility traces for the game Connect Four

It is shown that eligibility traces speed up learning by a factor of two and increase the asymptotic playing strength.
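
For reference, one TD(λ) step with accumulating eligibility traces for a linear value function V(s) = w·x(s); this is a generic sketch of the technique, not the paper's Connect Four code:

import numpy as np

def td_lambda_step(w, e, x, x_next, r, alpha=0.01, gamma=1.0, lam=0.8):
    delta = r + gamma * np.dot(w, x_next) - np.dot(w, x)  # TD error
    e = gamma * lam * e + x      # decay old traces, mark active features
    w = w + alpha * delta * e    # credit recent states, not only the last one
    return w, e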

High-Dimensional Function Approximation for Knowledge-Free Reinforcement Learning: a Case Study in SZ-Tetris

It is shown that a large systematic n-tuple network allows the classical temporal difference learning algorithm to obtain average performance similar to VD-CMA-ES, but at 20 times lower computational expense, leading to the best policy for SZ-Tetris known to date.
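
A minimal sketch of an n-tuple network as a value function, assuming a board encoded as an array of cell values in {0, …, n_values-1}; class and method names are illustrative:

import numpy as np

class NTupleNetwork:
    def __init__(self, tuples, n_values):
        self.tuples = tuples        # each tuple is a list of cell positions
        self.n_values = n_values    # number of possible symbols per cell
        self.luts = [np.zeros(n_values ** len(t)) for t in tuples]

    def index(self, board, t):
        idx = 0
        for cell in t:              # base-n_values encoding of the cells
            idx = idx * self.n_values + board[cell]
        return idx

    def value(self, board):
        return sum(lut[self.index(board, t)]
                   for t, lut in zip(self.tuples, self.luts))

    def update(self, board, delta):
        for t, lut in zip(self.tuples, self.luts):   # spread the TD error
            lut[self.index(board, t)] += delta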

Reinforcement Learning: An Introduction

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Mastering the game of Go without human knowledge

An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Temporal difference learning of N-tuple networks for the game 2048

The conducted experiments demonstrate that the learning algorithm using afterstate value functions is able to consistently produce players winning over 97% of games, and show that n-tuple networks combined with an appropriate learning algorithm have large potential, which could be exploited in other board games.
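
A sketch of the afterstate idea for a 2048-like game: the afterstate is the board after the player's move but before the random tile, so its value is deterministic. The game/value/update interfaces are assumptions, not the paper's code:

def play_and_learn(game, value, update, state, prev_after, alpha=0.0025):
    # greedy move on immediate reward plus afterstate value
    a = max(game.legal_actions(state),
            key=lambda m: game.reward(state, m)
                          + value(game.afterstate(state, m)))
    r = game.reward(state, a)
    s_after = game.afterstate(state, a)       # deterministic afterstate
    if prev_after is not None:                # TD(0) toward r + V(s_after)
        delta = r + value(s_after) - value(prev_after)
        update(prev_after, alpha * delta)
    # the environment then places a random tile (terminal update omitted)
    return game.add_random_tile(s_after), s_after

A driver loop would thread prev_after through successive calls and, at game end, adapt the last afterstate toward the final outcome.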

An Agent for EinStein Würfelt Nicht! Using N-Tuple Networks

The experimental results show that ε-greedy improved the playing strength the most, obtaining a win rate of 61.05% against the baseline agent; the enhanced agent competed in the Computer Olympiad 2017 tournament.
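
For illustration, ε-greedy move selection over learned values; a generic sketch with assumed interfaces, not the agent's actual code:

import random

def epsilon_greedy(game, value, state, eps=0.1):
    acts = game.legal_actions(state)
    if random.random() < eps:        # explore with probability eps
        return random.choice(acts)
    return max(acts, key=lambda a: value(game.step(state, a)))   # exploit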

General Board Game Playing for Education and Research in Generic AI Game Learning

  • W. Konen
  • Computer Science
    2019 IEEE Conference on Games (CoG)
  • 2019
A new general board game (GBG) playing and learning framework is presented that makes a generic TD(λ)-n-tuple agent available to arbitrary games for the first time and helps students start faster in the area of game learning.

Multi-Player Alpha-Beta Pruning

  • R. Korf
  • Computer Science
    Artif. Intell.
  • 1991
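
Korf's analysis builds on max^n search, where each node returns one value per player and the player to move maximizes his own component; the paper shows that only shallow alpha-beta-style pruning is sound in this setting. A plain max^n sketch with assumed interfaces:

def maxn(game, state, depth, heuristic):
    if game.is_over(state) or depth == 0:
        # one heuristic value per player
        return tuple(heuristic(state, p) for p in range(game.n_players))
    p = game.to_move(state)
    best = None
    for a in game.legal_actions(state):
        v = maxn(game, game.step(state, a), depth - 1, heuristic)
        if best is None or v[p] > best[p]:   # mover maximizes own component
            best = v
    return best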