Progressive Strategies for Monte-Carlo Tree Search

@article{Chaslot2008ProgressiveSF,
  title={Progressive Strategies for Monte-Carlo Tree Search},
  author={Guillaume Chaslot and Mark H. M. Winands and H. Jaap van den Herik and Jos Uiterwijk and Bruno Bouzy},
  journal={New Mathematics and Natural Computation},
  year={2008},
  volume={4},
  pages={343--357}
}
Monte-Carlo Tree Search (MCTS) is a new best-first search guided by the results of Monte-Carlo simulations. In this article, we introduce two progressive strategies for MCTS, called progressive bias and progressive unpruning. They enable the use of relatively time-expensive heuristic knowledge without speed reduction. Progressive bias directs the search according to heuristic knowledge. Progressive unpruning first reduces the branching factor, and then increases it gradually again. Experiments… 
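The abstract does not reproduce the formulas, but the two strategies can be illustrated with a minimal Python sketch of an MCTS selection step. Everything below (the Node class, the constants C and W, and the log-based widening schedule) is an illustrative assumption, not the paper's exact method.

import math

C = 0.7   # UCT exploration constant (illustrative value, not from the paper)
W = 10.0  # progressive-bias weight (illustrative value, not from the paper)

class Node:
    def __init__(self, move, heuristic_value):
        self.move = move
        self.h = heuristic_value  # time-expensive heuristic, computed once per node
        self.visits = 0
        self.wins = 0
        self.children = []

def selection_score(parent, child):
    # UCT value plus a progressive-bias term that fades as the child is visited,
    # so heuristic knowledge steers early selection without dominating later on.
    if child.visits == 0:
        return float("inf")
    exploit = child.wins / child.visits
    explore = C * math.sqrt(math.log(parent.visits) / child.visits)
    bias = W * child.h / (child.visits + 1)  # progressive bias (assumed decay schedule)
    return exploit + explore + bias

def select_child(parent):
    # Progressive unpruning: consider only the heuristically best children at
    # first, and widen the candidate set as the parent accumulates visits.
    ranked = sorted(parent.children, key=lambda c: c.h, reverse=True)
    k = min(len(ranked), 1 + int(math.log(parent.visits + 1)))
    return max(ranked[:k], key=lambda c: selection_score(parent, c))

Because the heuristic is evaluated once, at node creation, its cost is amortized over all subsequent simulations, which is consistent with the abstract's claim of using expensive knowledge without speed reduction.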

Citations

Beam Monte-Carlo Tree Search
TLDR
BMCTS significantly outperforms MCTS at equal time controls, and it is shown that the improvement is equivalent to an up to four-fold increase in computing time for MCTS.
Monte-Carlo Tree Search in Board Games
TLDR
This chapter gives an overview of popular and effective enhancements for board-game-playing MCTS agents, and mentions techniques to parallelize MCTS in a straightforward but effective way.
A Survey of Monte Carlo Tree Search Methods
TLDR
A survey of the Monte Carlo tree search literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research; it outlines the core algorithm's derivation, imparts some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and non-game domains.
Parallel Monte-Carlo Tree Search
TLDR
Three parallelization methods for MCTS are discussed: leaf parallelization, root parallelization, and tree parallelization.
An Analysis of Monte Carlo Tree Search
TLDR
Experimental evidence is presented suggesting that, under certain smoothness conditions, uniformly random simulation policies preserve the ordering over action preferences, which explains the success of MCTS despite its common use of these rollouts to evaluate states.
Monte-Carlo Tree Search and minimax hybrids
TLDR
This paper proposes MCTS-minimax hybrids that employ shallow minimax searches within the MCTS framework and investigates their effectiveness in the test domains of Connect-4 and Breakthrough.
Enhancements for Multi-Player Monte-Carlo Tree Search
TLDR
This paper proposes two enhancements for MCTS in multi-player games: Progressive History and Multi-Player Monte-Carlo Tree Search Solver (MP-MCTS-Solver).
Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel
TLDR
This paper discusses the adaptation of MCTS to simultaneous move games, and introduces a new algorithm, Online Outcome Sampling (OOS), that approaches a Nash equilibrium strategy over time.
Single-Player Monte-Carlo Tree Search
TLDR
This paper proposes a new MCTS variant, called Single-Player Monte-Carlo Tree Search (SP-MCTS), which makes use of a straightforward Meta-Search extension and achieved the highest score so far on the standardized test set.

References

Showing 1-10 of 25 references.
Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search
TLDR
A new framework that combines tree search with Monte-Carlo evaluation, without separating a min-max phase from a Monte-Carlo phase, is presented; it provides fine-grained control of the tree growth at the level of individual simulations and allows efficient selectivity.
Monte-Carlo Go Developments
TLDR
Two Go programs, Olga and Oleg, developed by a Monte-Carlo approach simpler than Bruegmann's (1993), are described; the ever-increasing power of computers leads us to think that Monte-Carlo approaches are worth considering for computer Go in the future.
Monte-Carlo strategies for computer Go
TLDR
Objective Monte-Carlo is a move-selection strategy that adjusts the amount of exploration and exploitation automatically and outperforms the two classical strategies previously proposed for Monte-Carlo Go: Simulated Annealing and Progressive Pruning.
Exploration exploitation in Go: UCT for Monte-Carlo Go
TLDR
A Monte-Carlo program, MoGo, the first computer Go program using UCT, is developed; its key implementation aspects, including efficient memory management, parametrization, ordering of non-visited nodes, and parallelization, are explained.
Modification of UCT with Patterns in Monte-Carlo Go
TLDR
A Monte-Carlo Go program, MoGo, the first computer Go program using UCT, is developed; the modification of UCT for the Go application is explained, along with the intelligent random simulation with patterns that has significantly improved MoGo's performance.
Monte-Carlo tree search in production management problems
TLDR
It is shown that Monte-Carlo Tree Search reaches a solution in a shorter period of time than the algorithm it is compared against, with improved solutions for large problems.
Bandit Based Monte-Carlo Planning
TLDR
A new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning is introduced and shown to be consistent, and finite-sample bounds on the estimation error due to sampling are derived.
Bandit Algorithms for Tree Search
TLDR
A Bandit Algorithm for Smooth Trees (BAST) is introduced which takes into account actual smoothness of the rewards for performing efficient "cuts" of sub-optimal branches with high confidence and is illustrated on a global optimization problem of a continuous function, given noisy values.
Monte-Carlo Go Reinforcement Learning Experiments
TLDR
The result obtained by the automatic learning experiments is better than that of the manual method by a 3-point margin on average, which is satisfactory, and the current results are promising on 19×19 boards.
Playing the Right Atari (T. Cazenave, J. Int. Comput. Games Assoc., 2007)
TLDR
A simple yet powerful optimization for Monte-Carlo Go tree search, which consists in dealing appropriately with strings that have two liberties, is experimented with.