Eric A. Hansen

Classic heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree, or an acyclic graph (AO*). In this paper, we describe a novel generalization of heuristic search, called LAO*, that can find solutions with loops. We show that LAO* can be used to solve Markov decision problems and that it shares the advantage heuristic…
We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterative elimination of dominated strategies in normal form games. We prove that it iteratively eliminates very weakly dominated strategies…
Most algorithms for solving POMDPs iteratively improve a value function that implicitly represents a policy and are said to search in value function space. This paper presents an approach to solving POMDPs that represents a policy explicitly as a finite-state controller and iteratively improves the controller by search in policy space. Two related…
We describe how to convert the heuristic search algorithm A* into an anytime algorithm that finds a sequence of improved solutions and eventually converges to an optimal solution. The approach we adopt uses weighted heuristic search to find an approximate solution quickly, and then continues the weighted search to find improved solutions as well as to…
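As a rough illustration of the idea in this abstract, the following is a minimal sketch of an anytime weighted A* search. It is not the authors' implementation: the function name `anytime_weighted_astar`, the `neighbors` and `h` callbacks, the toy graph, and the weight of 3 are all assumptions chosen for the example. The sketch orders the open list by g + w·h to reach a first solution quickly, keeps searching after each solution is found, and prunes any node whose unweighted g + h cannot beat the best solution so far (this pruning assumes `h` is admissible).

```python
import heapq

def anytime_weighted_astar(start, goal, neighbors, h, weight=2.0):
    """Yield (cost, path) pairs, each strictly cheaper than the last.

    Hypothetical sketch: `neighbors(n)` returns (successor, edge_cost)
    pairs and `h(n)` is assumed to be an admissible heuristic.
    """
    best_cost = float("inf")          # cost of the incumbent solution
    g = {start: 0.0}                  # cheapest known cost to each node
    parent = {start: None}
    open_heap = [(weight * h(start), start)]

    while open_heap:
        _, node = heapq.heappop(open_heap)
        # Prune with the unweighted f-value: this node cannot improve
        # on the incumbent solution.
        if g[node] + h(node) >= best_cost:
            continue
        if node == goal:
            best_cost = g[node]
            path, cur = [], node
            while cur is not None:    # walk parent pointers back to start
                path.append(cur)
                cur = parent[cur]
            yield best_cost, path[::-1]
            continue
        for nxt, cost in neighbors(node):
            new_g = g[node] + cost
            if new_g < g.get(nxt, float("inf")) and new_g + h(nxt) < best_cost:
                g[nxt] = new_g
                parent[nxt] = node
                # Order the open list by the weighted f-value g + w*h.
                heapq.heappush(open_heap, (new_g + weight * h(nxt), nxt))

# Toy graph (assumed for illustration): S-A-G costs 6, S-B-G costs 4.
graph = {"S": [("A", 1), ("B", 2)], "A": [("G", 5)], "B": [("G", 2)]}
heuristic = {"S": 3, "A": 1, "B": 2, "G": 0}

solutions = list(anytime_weighted_astar(
    "S", "G", lambda n: graph.get(n, []), heuristic.get, weight=3.0))
for cost, path in solutions:
    print(cost, path)
```

On this toy graph the weighted search first returns the path through A at cost 6, then improves to the path through B at cost 4, after which the open list empties and the search stops.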
Recent work shows that the memory requirements of best-first heuristic search can be reduced substantially by using a divide-and-conquer method of solution reconstruction. We show that memory requirements can be reduced even further by using a breadth-first instead of a best-first search strategy. We describe optimal and approximate breadth-first heuristic…
Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This…
We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information…