Pruning Game Tree by Rollouts

  • Bojun Huang
  • Published in AAAI, 25 January 2015
  • Computer Science
In this paper we show that the α-β algorithm and its successor MT-SSS*, two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. We show that any rollout policy in this family (either deterministic or randomized) is guaranteed to evaluate the game tree correctly with a finite number of rollouts. Moreover, we identify simple rollout policies in this family that "implement" α-β and MT-SSS*. Specifically, given…


A Rollout-Based Search Algorithm Unifying MCTS and Alpha-Beta

A single parameter makes it possible for the hybrid to subsume both MCTS as well as alpha-beta search as extreme cases, while allowing for a spectrum of new search algorithms in between.

An Adversarial Search Method Based on an Iterative Optimal Strategy

The main idea is that calculating the state values of intermediate nodes involves not only the static evaluation function but also a search into the future, where the latter is given a higher weight.

A Learned Query Rewrite System using Monte Carlo Tree Search

A policy-tree-based query rewrite framework is proposed, in which the root is the input query and each node is a query rewritten from its parent, together with a learning-based model that estimates the expected performance improvement of each rewritten query and guides the tree search more accurately.

Learning to Play: Reinforcement Learning and Games

It is shown that the methods generalize to three games, hinting at artificial general intelligence, and an argument can be made that in doing so the authors failed the Turing test, since no human can play at this level.

Tic-Tac-Toe: Understanding the Minimax Algorithm

A new mathematical technique is deduced to characterize winning play in Tic-Tac-Toe by using the minimax algorithm to predict the win or draw of a game from a player's first move.
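The minimax idea behind that analysis can be illustrated with a minimal, self-contained Tic-Tac-Toe solver. This is a hedged sketch for illustration only; the board encoding (a 9-character string) and helper names are assumptions, not taken from the cited paper.

```python
# The eight winning lines of the 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return +1 if X can force a win, -1 if O can, 0 for a forced draw.

    board is a 9-character string of 'X', 'O', or ' '; player is to move.
    """
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return 0  # full board, no winner: draw
    opponent = 'O' if player == 'X' else 'X'
    scores = [minimax(board[:i] + player + board[i + 1:], opponent)
              for i in moves]
    return max(scores) if player == 'X' else min(scores)
```

With perfect play from the empty board, `minimax(' ' * 9, 'X')` returns 0, reflecting the well-known fact that Tic-Tac-Toe is a draw under optimal play by both sides.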

Scaling Monte Carlo Tree Search on Intel Xeon Phi

This paper achieves, to the best of its knowledge, the fastest implementation of a parallel MCTS on the 61 core (= 244 hardware threads) Intel Xeon Phi using a real application (47 times faster than a sequential run).

Rollout-based Game-tree Search Outprunes Traditional Alpha-beta

This paper modifies a rollout-based method, FSSS, for use in game-tree search and shows that it out-prunes alpha-beta both empirically and formally.

A Minimax Algorithm Better than Alpha-Beta?

Best-First Fixed-Depth Minimax Algorithms

A new formulation of Stockman's SSS* algorithm, based on Alpha-Beta, is presented, finally transforming it into a practical algorithm. A framework is also presented that facilitates the construction of several best-first fixed-depth game-tree search algorithms, both known and new.

Monte-Carlo Tree Search and minimax hybrids

This paper proposes MCTS-minimax hybrids that employ shallow minimax searches within the MCTS framework and investigates their effectiveness in the test domains of Connect-4 and Breakthrough.

Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search

A new framework is presented that combines tree search with Monte-Carlo evaluation without separating a min-max phase from a Monte-Carlo phase; it provides fine-grained control of tree growth at the level of individual simulations and allows efficient selectivity.

A Review of Game-Tree Pruning

  • T. Marsland
  • Computer Science
  • J. Int. Comput. Games Assoc., 1986
These essential parts of game-tree searching and pruning are reviewed here, and the performance of refinements, such as aspiration and principal variation search, and aids like transposition and history tables, are compared.

Score Bounded Monte-Carlo Tree Search

The proposed algorithm significantly improves an MCTS solver by taking into account bounds on the possible scores of a node in order to select which nodes to explore in games that can end in draw positions.

On-Line Search for Solving Markov Decision Processes via Heuristic Sampling

This paper investigates the problem of refining near-optimal policies via online search techniques, tackling the local problem of finding an optimal action for a single current state of the system, and considers an online approach based on sampling: at each step, a randomly sampled look-ahead tree is developed to compute the optimal action for the current state.

Finite-time Analysis of the Multiarmed Bandit Problem

This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.