- Published 2011 in ArXiv

BPS, the Bayesian Problem Solver, applies probabilistic inference and decision-theoretic control to flexible, resource-constrained problem-solving. This paper focuses on the Bayesian inference mechanism in BPS, and contrasts it with those of traditional heuristic search techniques. By performing sound inference, BPS can outperform traditional techniques with significantly less computational effort. Empirical tests on the Eight Puzzle show that after only a few hundred node expansions, BPS makes better decisions than does the best existing algorithm after several million node expansions.

(This research was made possible by support from Heuristicrats, the National Aeronautics and Space Administration, and the Rand Corporation.)

1 Problem-Solving and Search

Problem-solving may be formalized within the problem-space representation of [17], in which problems are specified by a set of possible states of the world and a set of operators, or transitions between states. A problem instance is specified by a single initial state, I, and a set of goal states, G. The states and operators form the nodes and directed arcs of a state-space graph. Problem-solving in the state-space requires applying a sequence of operators, or solution-path, to state I, yielding a state in G.

1.1 Complete Decision-Trees

Existing state-space search algorithms are adaptations of the game-theoretic techniques for evaluating decision-trees. From the perspective of a problem-solver, the state-space graph can be viewed as a decision-tree, whose root node is the initial state. In this decision-tree, paths correspond to sequences of operators, and leaves to terminal states of the problem.

These leaves have associated payoffs, or outcomes. In the two-player game of chess, for example, the outcomes are win, loss and draw. Similarly, in single-agent path-planning domains, the outcomes describe the desirability of goal states and the paths used to reach them.
In theory, one can directly label each leaf with its outcome, and recursively label each internal node by assigning it the outcome of its most preferred child. Once the entire tree has been so labelled, the problem-solver need only move from the root node to the neighboring node labelled with the most preferred outcome. The MaxMin procedure [26] is a classic example of this constraint-satisfaction labelling technique.

1.2 Traditional Approaches

The complexity of state-space problem-solving arises from the fact that most interesting problems have enormous, densely-connected state-space graphs. Because of resource (e.g., computational) constraints, most problem-solvers will be unable to explore the entirety of the decision-tree before being forced to commit to an action. Rather, one will see only a relatively small portion of the entire decision-tree, and must select an operator to apply before knowing the outcomes of all adjacent states with certainty.

Figure 1: Search Horizon and Heuristic Error

How should a partial decision-tree be interpreted? The conventional wisdom in both artificial intelligence and decision analysis is to evaluate the partial decision-tree as a decision-tree proper. Examples of this approach include the "averaging out and folding back" technique in decision analysis [20], and the Minimax algorithm in artificial intelligence [25], both of which trace their origin to MaxMin. In these techniques, heuristics are used to estimate the (unknown) outcomes of the frontier nodes (the leaves of the partial decision-tree). Then, for computational simplicity, these estimates are assumed to be perfect, thus licensing a problem-solver to invoke the straightforward constraint-satisfaction algorithm described above. In assuming the labels to be accurate, these traditional algorithms are liable to be fooled by error in the outcome estimates.
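This backed-up labelling can be sketched in a few lines of Python. The tiny game tree and the win/draw/loss preference ordering below are assumptions chosen for the example, not taken from the paper:

```python
# Minimal sketch of MaxMin-style constraint-satisfaction labelling.
# Internal nodes are lists of children; leaves are outcome strings.
# (Hypothetical tree and preference ordering, for illustration only.)

# Outcomes ordered from least to most preferred for the player to move.
PREFERENCE = {"loss": 0, "draw": 1, "win": 2}

def label(node, maximizing=True):
    """Recursively label a node with the outcome of its most preferred
    child; preferences flip between the two players at each level."""
    if isinstance(node, str):              # leaf: already labelled
        return node
    children = [label(child, not maximizing) for child in node]
    key = lambda outcome: PREFERENCE[outcome]
    return max(children, key=key) if maximizing else min(children, key=key)

tree = [["win", "loss"], ["draw", "draw"]]
print(label(tree))   # → draw
```

The opponent's minimizing choice turns the left subtree's "win" leaf into a backed-up "loss", so the maximizer prefers the guaranteed draw on the right.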
However, it is generally assumed that this weakness of the face-value assumption can be compensated for by searching deeper in the tree. The belief that error is diminished by searching deeper stems from the assumption that heuristic error grows in proportion to distance from the goal. As searching deeper causes some of the frontier nodes to approach the goal, the increased accuracy of these estimates is claimed to improve decision-quality. Thus the traditional algorithms are expected to converge to correct decisions with deeper search.

As Figure 1 suggests, this line of reasoning is flawed. As comparatively few of the nodes expanded in a search are closer to G than I is (exponential branching compounds the effect visible in this planar graph), most of the estimates at the frontier are fraught with error. Even under optimistic assumptions about heuristic error (e.g., bounded relative error), the likelihood of a traditional technique being misled by an erroneous estimate will increase with search depth (analytical and empirical studies can be found in [9]). Only if a search algorithm includes a sound inference mechanism for interpreting heuristic estimates can one unequivocally conclude that increasing search depth yields better decisions.

2 A Bayesian Approach

The Bayesian approach attempts to adjust the heuristic estimates in light of information from other nodes in the tree. Specifically, by modelling the error in the heuristic function as well as inter-node outcome constraints, one may determine, for each node, the probability of each possible outcome, conditioned on evidence provided by heuristic evaluations. In path-planning, for example, one would determine, for each node in the search graph, the probability distribution over possible distances from the nearest goal. In chess, one would determine the probability that each node leads to a win, loss or draw. Subsequently, one could take the action which maximizes expected utility [6].
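The vulnerability of the face-value interpretation can be sketched concretely: a single erroneous frontier estimate is enough to flip the root decision. The single-agent tree, the heuristic values, and the injected error below are all invented for illustration:

```python
# Face-value backup in a single-agent search tree: frontier estimates
# are propagated as if they were exact outcome values.
# (Hypothetical tree and heuristic values, for illustration only.)

def backup(node, children, h):
    """Back up the best heuristic value found beneath a node."""
    kids = children.get(node, [])
    if not kids:
        return h[node]            # frontier estimate, taken at face value
    return max(backup(k, children, h) for k in kids)

children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1", "B2"]}

true_h  = {"A1": 3, "A2": 1, "B1": 2, "B2": 2}   # accurate estimates
noisy_h = dict(true_h, B1=5)                     # one overestimated node

choose = lambda h: max(["A", "B"], key=lambda c: backup(c, children, h))
print(choose(true_h))    # → A
print(choose(noisy_h))   # → B  (a single error flips the decision)
```

Because the backup rule trusts every label, the overestimate at B1 propagates unchallenged to the root, which is exactly the failure mode the paper attributes to deeper face-value search.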
To formalize this discussion, consider the search graph shown in Figure 2. From the root node, S0, one must choose whether to move to S1, S2 or S3. Let the (unknown) outcome of node Si be denoted by Oi. To make a rational choice (i.e., maximize expected utility) one needs the beliefs

P(O1 = a), P(O1 = b), ...
P(O2 = a), P(O2 = b), ...
P(O3 = a), P(O3 = b), ...

where a, b, ... are the possible outcomes. Unambiguously, let P(Oi) denote the vector of values (P(Oi = a), P(Oi = b), ...).

Figure 2: Search Graph

Once a heuristic evaluation for all the nodes in the search graph has been recorded, one must determine the three vectors

P(Oi | h(S0), h(S1), h(S2), h(S3), h(S21), h(S22))

where i ∈ {1, 2, 3} and h(Si) is the heuristic evaluation of node Si. This requires a model of the probabilistic relationships between heuristic values and outcomes, and of the inter-node outcome constraints.

2.1 Probabilistic Heuristic Estimates
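A minimal sketch of this computation applies Bayes' rule to each node and then maximizes expected utility. The error model P(h | outcome), the priors, and the utilities below are invented for illustration; moreover, conditioning each outcome only on that node's own evaluation ignores the inter-node constraints the paper goes on to model:

```python
# Sketch of the Bayesian interpretation of heuristic values: each
# evaluation is evidence about its node's outcome, and the chosen move
# maximizes expected utility. (Error model, priors, and utilities are
# hypothetical; each node is treated independently for simplicity.)

OUTCOMES = ["win", "draw", "loss"]
PRIOR = {o: 1/3 for o in OUTCOMES}

def likelihood(h_value, outcome):
    """A made-up noisy-heuristic model P(h = v | outcome): the heuristic
    tends to score wins near 2, draws near 1, losses near 0."""
    center = {"win": 2, "draw": 1, "loss": 0}[outcome]
    return {0: 0.6, 1: 0.3, 2: 0.1}[abs(h_value - center)]

def posterior(h_value):
    """P(outcome | h) via Bayes' rule."""
    joint = {o: PRIOR[o] * likelihood(h_value, o) for o in OUTCOMES}
    z = sum(joint.values())
    return {o: p / z for o, p in joint.items()}

UTILITY = {"win": 1.0, "draw": 0.5, "loss": 0.0}

def expected_utility(h_value):
    return sum(posterior(h_value)[o] * UTILITY[o] for o in OUTCOMES)

# Choose among the root's children given their heuristic evaluations.
h = {"S1": 2, "S2": 1, "S3": 0}
best = max(h, key=lambda s: expected_utility(h[s]))
print(best)   # → S1
```

Unlike the face-value approach, the posterior here retains residual probability mass on the other outcomes, so the decision degrades gracefully as the heuristic becomes less informative.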

@article{Hansson2011HeuristicSA,
title={Heuristic Search as Evidential Reasoning},
author={Othar Hansson and Andy Mayer},
journal={CoRR},
year={2011},
volume={abs/1304.1509}
}