Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search

@inproceedings{Coulom2006EfficientSA,
  title={Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search},
  author={R{\'e}mi Coulom},
  booktitle={Computers and Games},
  year={2006}
}
  • Rémi Coulom
  • Published in Computers and Games 29 May 2006
  • Computer Science
A Monte-Carlo evaluation consists in estimating a position by averaging the outcomes of several random continuations. The method can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework that combines tree search with Monte-Carlo evaluation and does not separate a min-max phase from a Monte-Carlo phase. Instead of backing up the min-max value close to the root, and the average value at some depth, a more general backup operator is defined that…
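The first sentence of the abstract, estimating a position by averaging the outcomes of random continuations, can be illustrated with a minimal Python sketch. The toy game (a single pile of stones, each player takes 1 to 3, whoever takes the last stone wins) and the names `random_playout` and `monte_carlo_evaluate` are assumptions chosen for illustration only; they do not come from the paper, and the sketch does not implement the paper's selectivity or backup operators.

```python
import random


def random_playout(stones, rng):
    """Play uniformly random moves until the pile is empty.

    Returns 1.0 if the player to move at the start takes the last
    stone (a win in this hypothetical toy game), otherwise 0.0.
    """
    mover_is_first = True
    while stones > 0:
        take = rng.randint(1, min(3, stones))  # inclusive bounds
        stones -= take
        if stones == 0:
            return 1.0 if mover_is_first else 0.0
        mover_is_first = not mover_is_first
    return 0.0  # empty pile at the start: the player to move has lost


def monte_carlo_evaluate(stones, n_playouts=1000, seed=0):
    """Estimate a position as the average outcome of random playouts."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total = sum(random_playout(stones, rng) for _ in range(n_playouts))
    return total / n_playouts


if __name__ == "__main__":
    for pile in (1, 2, 4, 10):
        print(pile, monte_carlo_evaluate(pile))
```

The returned average is only an estimate of the win rate under random play, which is why, as the abstract notes, such an evaluation is typically used at the leaves of a search tree rather than on its own.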
Monte-Carlo Tree Search
TLDR
The most promising results so far have been obtained in the game of Go, in which it outperformed all classic techniques, and therefore Go is used as the main test domain.
Monte-Carlo Expression Discovery
  • T. Cazenave
  • Computer Science
    Int. J. Artif. Intell. Tools
  • 2013
TLDR
The proposed approach to Monte-Carlo Tree Search is simple to program, does not suffer from expression growth, has a natural restart strategy to avoid local optima and is extremely easy to parallelize.
Computer Go and Monte Carlo Tree Search: Opening Book and Parallel Solutions
TLDR
A method to guide a Monte Carlo Tree Search in the initial moves of the game of Go, which matches the current state of a Go board against clusters of board configurations that are derived from a large number of games played by experts.
Time Management for Monte-Carlo Tree Search Applied to the Game of Go
TLDR
Results indicate that clever time management can have a very significant effect on playing strength in the case of Monte-Carlo tree search.
A Survey of Monte Carlo Tree Search Methods
TLDR
A survey of the literature to date on Monte Carlo tree search, intended to provide a snapshot of the state of the art after the first five years of MCTS research, outlines the core algorithm's derivation, imparts some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and nongame domains.
Evaluation of Monte Carlo tree search and the application to Go
TLDR
A method of measuring the accuracy of Monte Carlo tree search in game programming using the win percentage of positions in a large database of game records as a benchmark and comparing the win probability obtained by simulations with the benchmark is presented.
Monte-Carlo Fork Search for Cooperative Path-Finding
TLDR
Nested MCFS (NMCFS) solves congestion problems from the literature, finding better solutions than the state-of-the-art solutions, and it solves N-puzzles without hole near-optimally.
Beam Monte-Carlo Tree Search
TLDR
BMCTS significantly outperforms MCTS at equal time controls, and it is shown that the improvement is equivalent to an up to four-fold increase in computing time for MCTS.
Monte-Carlo tree search in Ms. Pac-Man
TLDR
A performance comparison between the proposed system and existing programs showed a significant improvement in the proposed system's ability to survive, implying the effectiveness of the proposed method.
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search
TLDR
It is demonstrated that in some probabilistic planning benchmarks from the International Planning Competition (IPC), selecting a MCTS variant with a backup strategy different from Monte Carlo averaging can lead to substantially better results.

References

Associating Shallow and Selective Global Tree Search with Monte Carlo for 9*9 Go
  • B. Bouzy
  • Computer Science
    Computers and Games
  • 2004
TLDR
This exploration is based on Olga and Indigo, two experimental Monte-Carlo programs, and provides a min-max algorithm that iteratively deepens the tree until one move at the root is proved to be superior to the other ones.
Move-Pruning Techniques for Monte-Carlo Go
TLDR
Two new pruning techniques are presented: Miai Pruning (MP) and Set Pruning (SP), which clearly speed up the process of selecting a move on 9×9 boards, and MP slightly improves the playing level.
Monte-Carlo Go Developments
TLDR
Two Go programs are described, Olga and Oleg, developed by a Monte-Carlo approach that is simpler than Bruegmann’s (1993) approach, and the ever-increasing power of computers leads us to think that Monte-Carlo approaches are worth considering for computer Go in the future.
On-Line Search for Solving Markov Decision Processes via Heuristic Sampling
TLDR
This paper investigates the problem of refining near optimal policies via online search techniques, tackling the local problem of finding an optimal action for a single current state of the system, and considers an on-line approach based on sampling: at each step, a randomly sampled look-ahead tree is developed to compute the optimal action for the current state.
Searching with probabilities
TLDR
It is demonstrated that probability distributions, using a modified B*-type search algorithm, can successfully be used as a knowledge representation technique and it is shown that the use of probability distributions is superior to the use of either single values or ranges.
A Bayesian Approach to Relevance in Game Playing
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
TLDR
This paper presents a new algorithm that, given only a generative model (a natural and common type of simulator) for an arbitrary MDP, performs on-line, near-optimal planning with a per-state running time that has no dependence on the number of states.
Using Selective-Sampling Simulations in Poker
TLDR
This paper describes work being done on developing a world-class poker-playing program and proposes selective sampling simulations as a general-purpose framework for building programs to achieve high performance in imperfect information games.
A Simulated Annealing Algorithm with Constant Temperature for Discrete Stochastic Optimization
TLDR
A modification of the simulated annealing algorithm designed for solving discrete stochastic optimization problems that uses a constant (rather than decreasing) temperature for estimating the optimal solution and shows that both variants of the method are guaranteed to converge almost surely to the set of global optimal solutions.