We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterative elimination of dominated strategies in normal form games. We prove that it iteratively eliminates very weakly dominated strategies…
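As a rough illustration of the normal-form ingredient mentioned in this abstract, the sketch below iteratively removes very weakly dominated pure strategies from a two-player matrix game. It is a simplification and not the paper's algorithm: it only checks domination by other pure strategies, and the function name `eliminate_very_weakly_dominated` and the payoff layout (`row_payoff[r][c]`, `col_payoff[r][c]`) are assumptions made for this example.

```python
def eliminate_very_weakly_dominated(row_payoff, col_payoff):
    """Return the surviving (row indices, column indices).

    A strategy is treated as very weakly dominated when some other
    surviving pure strategy does at least as well against every surviving
    opponent strategy. Strategies are removed one at a time, so of two
    identical strategies exactly one survives.
    """
    rows = list(range(len(row_payoff)))
    cols = list(range(len(row_payoff[0])))

    def dominated_row(r):
        return any(
            d != r and all(row_payoff[d][c] >= row_payoff[r][c] for c in cols)
            for d in rows
        )

    def dominated_col(c):
        return any(
            d != c and all(col_payoff[r][d] >= col_payoff[r][c] for r in rows)
            for d in cols
        )

    while True:
        victim = next((r for r in rows if dominated_row(r)), None)
        if victim is not None:
            rows.remove(victim)
            continue
        victim = next((c for c in cols if dominated_col(c)), None)
        if victim is not None:
            cols.remove(victim)
            continue
        return rows, cols


# Prisoner's dilemma with strategies 0 = cooperate, 1 = defect.
row_payoff = [[3, 0], [5, 1]]
col_payoff = [[3, 5], [0, 1]]
print(eliminate_very_weakly_dominated(row_payoff, col_payoff))  # ([1], [1])
```

In the prisoner's dilemma only mutual defection survives, since defecting is at least as good as cooperating against either opponent choice.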

Classic heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree, or an acyclic graph (AO*). In this paper, we describe a novel generalization of heuristic search, called LAO*, that can find solutions with loops. We show that LAO* can be used to solve Markov decision problems and that it shares the advantage heuristic…

Most algorithms for solving POMDPs iteratively improve a value function that implicitly represents a policy and are said to search in value function space. This paper presents an approach to solving POMDPs that represents a policy explicitly as a finite-state controller and iteratively improves the controller by search in policy space. Two related algorithms…

Recent work shows that the memory requirements of best-first heuristic search can be reduced substantially by using a divide-and-conquer method of solution reconstruction. We show that memory requirements can be reduced even further by using a breadth-first instead of a best-first search strategy. We describe optimal and approximate breadth-first heuristic…

We describe how to convert the heuristic search algorithm A* into an anytime algorithm that finds a sequence of improved solutions and eventually converges to an optimal solution. The approach we adopt uses weighted heuristic search to find an approximate solution quickly, and then continues the weighted search to find improved solutions as well as to…
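The idea in this abstract can be sketched as follows: order the open list by g + w·h to reach a first solution quickly, then keep searching and report each cheaper solution as it is found. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `anytime_weighted_astar`, the callback signatures, and the pruning rule (discard a node when g + h cannot beat the best solution found so far, which relies on an admissible `h` and non-negative edge costs) are choices made for this example.

```python
import heapq
import itertools
import math


def anytime_weighted_astar(start, is_goal, successors, h, weight=2.0):
    """Yield (cost, path) pairs with strictly decreasing cost.

    successors(node) returns (next_node, step_cost) pairs. If h is
    admissible, the last pair yielded is an optimal solution.
    """
    g = {start: 0.0}
    parent = {start: None}
    tie = itertools.count()  # tie-breaker so nodes are never compared
    open_heap = [(weight * h(start), next(tie), start)]
    best_cost = math.inf

    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        # g + h is a lower bound on any solution through this node.
        if g[node] + h(node) >= best_cost:
            continue
        if is_goal(node):
            best_cost = g[node]
            path, cur = [], node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            yield best_cost, path[::-1]
            continue
        for nxt, step_cost in successors(node):
            new_g = g[node] + step_cost
            if new_g < g.get(nxt, math.inf) and new_g + h(nxt) < best_cost:
                g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(
                    open_heap, (new_g + weight * h(nxt), next(tie), nxt)
                )


# Hypothetical four-node graph used only to exercise the sketch.
graph = {"S": [("A", 1), ("B", 3)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
estimate = {"S": 2, "A": 0, "B": 1, "G": 0}

for cost, path in anytime_weighted_astar(
    "S", lambda n: n == "G", lambda n: graph[n], lambda n: estimate[n], weight=5.0
):
    print(cost, path)
# 6.0 ['S', 'A', 'G']
# 4.0 ['S', 'B', 'G']
```

Because the function is a generator, a caller can stop consuming results whenever its time budget runs out and keep the most recent solution.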

A new policy iteration algorithm for partially observable Markov decision processes is presented that is simpler and more efficient than an earlier policy iteration algorithm of Sondik (1971, 1978). The key simplification is representation of a policy as a finite-state controller. This representation makes policy evaluation straightforward. The paper's…
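To make the remark about policy evaluation concrete, the sketch below evaluates a deterministic finite-state controller on a small discounted POMDP. It is an illustration under assumptions, not the paper's code: the function name `evaluate_controller` and the array layouts are hypothetical, and the values are found by repeatedly applying the evaluation equations until they stop changing (which needs a discount factor below 1) rather than by solving the linear system directly.

```python
def evaluate_controller(actions, successor, T, O, R, gamma=0.95, tol=1e-10):
    """Return V where V[n][s] is the value of node n in state s.

    actions[n]       action taken in controller node n
    successor[n][o]  next controller node after observation o
    T[a][s][s2]      probability of moving from s to s2 under action a
    O[a][s2][o]      probability of observing o after reaching s2 via a
    R[a][s]          immediate reward for action a in state s
    """
    n_nodes = len(actions)
    n_states = len(T[0])
    V = [[0.0] * n_states for _ in range(n_nodes)]
    while True:
        new_V = [[0.0] * n_states for _ in range(n_nodes)]
        delta = 0.0
        for n in range(n_nodes):
            a = actions[n]
            for s in range(n_states):
                total = R[a][s]
                for s2 in range(n_states):
                    for o, n2 in enumerate(successor[n]):
                        total += gamma * T[a][s][s2] * O[a][s2][o] * V[n2][s2]
                new_V[n][s] = total
                delta = max(delta, abs(total - V[n][s]))
        V = new_V
        if delta < tol:
            return V


# Two-node controller that alternates between a rewarding action (0)
# and a non-rewarding action (1) in a one-state, one-observation problem.
V = evaluate_controller(
    actions=[0, 1],
    successor=[[1], [0]],
    T=[[[1.0]], [[1.0]]],
    O=[[[1.0]], [[1.0]]],
    R=[[1.0], [0.0]],
    gamma=0.5,
)
print(V)  # approximately [[1.3333], [0.6667]]
```

In the example the two values satisfy V0 = 1 + 0.5·V1 and V1 = 0.5·V0, which gives V0 = 4/3 and V1 = 2/3.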

We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information…

We develop a hierarchical approach to planning for partially observable Markov decision processes (POMDPs) in which a policy is represented as a hierarchical finite-state controller. To provide a foundation for this approach, we discuss some extensions of the POMDP framework that allow us to formalize the process of abstraction by which a hierarchical…

Contingent planning (constructing a plan in which action selection is contingent on imperfect information received during plan execution) can be formalized as the problem of solving a partially observable Markov decision process (POMDP). Traditional dynamic programming algorithms for POMDPs use a flat state representation that enumerates all possible states…

Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This…