Combinatorial Multi-armed Bandits for Real-Time Strategy Games

@article{Ontan2017CombinatorialMB,
  title={Combinatorial Multi-armed Bandits for Real-Time Strategy Games},
  author={Santiago Onta{\~n}{\'o}n},
  journal={ArXiv},
  year={2017},
  volume={abs/1710.04805}
}
Games with large branching factors pose a significant challenge for game tree search algorithms. [...] We analyze the theoretical properties of several variants of naive sampling, and empirically compare it against the other existing strategies in the literature for CMABs. We then evaluate these strategies in the context of real-time strategy (RTS) games, a genre of computer games characterized by their very large branching factors. Our results show that as the branching factor grows, naive sampling…
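The key method summarized above, naive sampling, treats each component of a combinatorial action as its own local arm. Below is a minimal, illustrative sketch of an ε-greedy flavour of this idea; the class name, the parameters eps0 and eps_local, and the exact exploit/explore policy are assumptions made for the example rather than the paper's pseudocode, and the reward is whatever scalar the environment (e.g. a game playout) returns.

```python
import random
from collections import defaultdict


class NaiveSampler:
    """Illustrative sketch of epsilon-greedy naive sampling for a CMAB.

    A macro-arm is a tuple with one value per decision variable. The "naive"
    assumption is that the global reward decomposes (approximately) as a sum
    of local rewards, so each variable can be handled by its own local MAB.
    Names and defaults (eps0, eps_local) are assumptions for this sketch.
    """

    def __init__(self, local_arms, eps0=0.25, eps_local=0.1):
        self.local_arms = local_arms        # list of lists: candidate values per variable
        self.eps0 = eps0                    # prob. of exploring via the local MABs
        self.eps_local = eps_local          # per-variable exploration rate
        self.local_stats = [defaultdict(lambda: [0, 0.0]) for _ in local_arms]
        self.global_stats = defaultdict(lambda: [0, 0.0])   # macro-arm -> [count, reward sum]

    @staticmethod
    def _best(stats, candidates):
        # Unseen candidates get +inf so they are tried at least once.
        return max(candidates,
                   key=lambda a: stats[a][1] / stats[a][0] if stats[a][0] else float("inf"))

    def select(self):
        if random.random() < self.eps0 or not self.global_stats:
            # Explore: choose each variable's value with its own epsilon-greedy local MAB.
            return tuple(
                random.choice(arms) if random.random() < self.eps_local
                else self._best(stats, arms)
                for arms, stats in zip(self.local_arms, self.local_stats)
            )
        # Exploit: choose the best macro-arm sampled so far (global MAB).
        return self._best(self.global_stats, list(self.global_stats))

    def update(self, macro, reward):
        n, s = self.global_stats[macro]
        self.global_stats[macro] = [n + 1, s + reward]
        for value, stats in zip(macro, self.local_stats):
            n, s = stats[value]
            stats[value] = [n + 1, s + reward]
```

A caller would simply alternate select() and update(macro, reward) inside its evaluation loop.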
Citations

Action Abstractions for Combinatorial Multi-Armed Bandit Tree Search
Introduces two CMAB-based search algorithms that use action abstraction schemes to reduce the action space considered during search; the regular action abstractions A1N, A2N, and A3N are able to outperform all state-of-the-art search algorithms tested.

MCTS Pruning in Turn-Based Strategy Games
Studies pruning techniques and the insertion of domain knowledge to deal with high branching factors in a new turn-based strategy game, Tribes, and shows that MCTS can increase its performance and outperform the rule-based agents and Rolling Horizon Evolutionary Algorithms.

Improving the Performance of MCTS-Based µRTS Agents Through Move Pruning
Proposes move pruning as a way to improve the performance of MCTS-based agents in the context of RTS games, describes a class of possibly detrimental player-actions, and proposes several pruning approaches targeting it.

On-Line Parameter Tuning for Monte-Carlo Tree Search in General Game Playing
Proposes a method to automatically tune search-control parameters on-line for GGP, treating the tuning problem as a Combinatorial Multi-Armed Bandit (CMAB).

Self-Adaptive Monte Carlo Tree Search in General Game Playing
Presents a self-adaptive MCTS strategy (SA-MCTS) that integrates into the search a method to automatically tune search-control parameters online per game, along with five different allocation strategies that decide how to allocate available samples to evaluating parameter values.

Player Modeling via Multi-Armed Bandits
Presents a novel approach to player modeling based on multi-armed bandits (MABs), together with an approach to evaluating and fine-tuning these algorithms prior to generating data in a user study.

Extracting Policies from Replays to Improve MCTS in Real Time Strategy Games
Focuses on learning a tree policy for MCTS using existing supervised learning algorithms, evaluating and comparing two families of models: Bayesian classifiers and decision tree classifiers.

Intelligent Adjustment of Game Properties at Run Time Using Multi-armed Bandits
Proposes a technique based on the multi-armed bandit (MAB) approach for intelligent and dynamic theme selection in a video game, with the potential to be used as a toolkit for determining the preferences of players in real time.

When Gaussian Processes Meet Combinatorial Bandits: GCB
Combinatorial bandits (CMAB) are a generalization of the well-known multi-armed bandit framework, in which the learner chooses, at each round, a subset of the available arms that satisfies some known…
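For orientation, the round structure these CMAB entries refer to can be written out compactly; the notation below is a generic textbook-style formulation and is not taken from GCB or from any specific paper listed here.

```latex
% Generic CMAB round: the learner picks a feasible super-arm S_t (a subset of base arms),
% e.g. subject to a cardinality constraint, and tries to match the best feasible super-arm.
\[
  S_t \in \mathcal{S} \subseteq 2^{\{1,\dots,K\}},
  \qquad \text{e.g. } \mathcal{S} = \{\, S : |S| \le m \,\},
  \qquad
  \bar{R}_T \;=\; T \max_{S \in \mathcal{S}} \mu(S) \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} \mu(S_t)\right],
\]
% where \mu(S) is the expected reward of super-arm S and \bar{R}_T is the cumulative pseudo-regret.
```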
Search, Abstractions and Learning in Real-Time Strategy Games
Decomposes the game into sub-problems and integrates the partial solutions into action scripts that can be used as abstract actions by a search or machine learning algorithm to produce sound strategic choices.

References

Showing 1-10 of 51 references.
The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games
Presents a new MCTS algorithm based on naive sampling, called NaiveMCTS, and evaluates it in the context of real-time strategy (RTS) games, showing that as the branching factor grows, NaiveMCTS performs significantly better than other algorithms.
Informed Monte Carlo Tree Search for Real-Time Strategy Games
Santiago Ontañón. 2016 IEEE Conference on Computational Intelligence and Games (CIG), 2016.
Studies the use of Bayesian models to estimate the probability distribution of actions played by a strong player, and the incorporation of such models into NaiveMCTS, an MCTS algorithm designed for games with combinatorial branching factors.
On Combinatorial Actions and CMABs with Linear Side Information
Proposes a novel CMAB planning scheme, as well as two specific instances of this scheme dedicated to exploiting what is called linear side information, and shows that the resulting algorithms compete very favorably with the state of the art.

Portfolio Greedy Search and Simulation for Large-Scale Combat in StarCraft
Presents an efficient system for modelling abstract RTS combat called SparCraft, which can perform millions of unit actions per second and visualize them, together with a modification of the UCT algorithm capable of performing search in games with simultaneous and durative actions.

Almost Optimal Exploration in Multi-Armed Bandits
Presents two novel, parameter-free algorithms for identifying the best arm in two different settings, given a target confidence or given a target budget of arm pulls, and proves upper bounds whose gap from the lower bound is only doubly-logarithmic in the problem parameters.

Game-Tree Search over High-Level Game States in RTS Games
Proposes a high-level abstract representation of the game state that significantly reduces the branching factor when used for game-tree search algorithms, evaluated in the context of StarCraft.

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
Focuses on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs.
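The two regret notions this survey contrasts can be stated side by side; the notation below is the standard paraphrase of the usual definitions, not an excerpt from the survey itself.

```latex
% Stochastic (i.i.d.) pseudo-regret vs. adversarial regret against the best fixed arm in hindsight.
\[
  \bar{R}_T = T\,\mu^{*} - \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{I_t}\right]
  \quad\text{with } \mu^{*} = \max_{i} \mu_i,
  \qquad
  R_T = \max_{i} \sum_{t=1}^{T} g_{i,t} - \sum_{t=1}^{T} g_{I_t,t},
\]
% where I_t is the arm pulled at round t, \mu_i is the mean payoff of arm i (stochastic case),
% and g_{i,t} is the adversarially chosen payoff of arm i at round t.
```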
Alpha-Beta Pruning for Games with Simultaneous Moves
Introduces an Alpha-Beta-like sound pruning method for the more general class of "stacked matrix games" that allow simultaneous moves by both players, showing considerable savings in terms of expanded nodes compared to naive depth-first move computation without pruning.

Sequential Halving Applied to Trees
T. Cazenave. IEEE Transactions on Computational Intelligence and AI in Games, 2015.
Proposes an alternative MCTS algorithm, Sequential Halving applied to Trees (SHOT), which has multiple advantages over UCT: it spends less time in the tree, it uses less memory, it is parameter free, at equal time settings it beats UCT for a complex combinatorial game, and it can be efficiently parallelized.
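The sequential halving allocation that SHOT builds on is simple to state. The sketch below is plain sequential halving over a flat set of arms, with an assumed pull(arm) callback (e.g. one playout returning a reward); it does not include the tree-specific machinery described in the paper.

```python
import math

def sequential_halving(arms, budget, pull):
    """Plain sequential halving (flat arms, not the tree variant used by SHOT).

    arms   : list of arm identifiers
    budget : total number of pulls allowed
    pull   : callable arm -> sampled reward (assumed, e.g. the result of one playout)
    """
    survivors = list(arms)
    rounds = max(1, math.ceil(math.log2(len(survivors))))
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        # Spread this round's share of the budget uniformly over the surviving arms.
        pulls_per_arm = max(1, budget // (len(survivors) * rounds))
        means = {
            arm: sum(pull(arm) for _ in range(pulls_per_arm)) / pulls_per_arm
            for arm in survivors
        }
        # Keep the better-scoring half of the arms for the next round.
        survivors.sort(key=lambda a: means[a], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]
```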
MCTS Based on Simple Regret
Proposes a sampling scheme for MCTS that is "aware" of the value of information (VOI), yielding an algorithm that in empirical evaluation outperforms both UCT and the other proposed algorithms.