Perseus: Randomized Point-based Value Iteration for POMDPs

@article{Spaan2005PerseusRP,
  title={Perseus: Randomized Point-based Value Iteration for {POMDPs}},
  author={M. Spaan and N. Vlassis},
  journal={J. Artif. Intell. Res.},
  year={2005},
  volume={24},
  pages={195--220}
}
Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called PERSEUS. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points.
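The backup stage sketched in the abstract translates directly into code. Below is a minimal, illustrative Python sketch of one PERSEUS backup stage, assuming a flat POMDP model held in hypothetical numpy arrays (T[a] for transitions, O[a] for observations, R[a] for rewards; all names are chosen here for illustration). It is one reading of the abstract, not the authors' implementation.

import numpy as np

def point_backup(b, V, T, O, R, gamma):
    # Back up belief b against the current value function V (a list of
    # alpha-vectors). Assumed model layout (hypothetical, per the lead-in):
    #   T[a]: |S| x |S| matrix with T[a][s, s'] = P(s'|s,a)
    #   O[a]: |S| x |Omega| matrix with O[a][s', o] = P(o|s',a)
    #   R[a]: length-|S| reward vector
    best_val, best_alpha = -np.inf, None
    for a in range(len(T)):
        g_sum = np.zeros_like(b)
        for o in range(O[a].shape[1]):
            # Back-project every alpha-vector through (a, o) and keep the
            # one that scores highest at b.
            g = [T[a] @ (O[a][:, o] * alpha) for alpha in V]
            g_sum += g[int(np.argmax([gi @ b for gi in g]))]
        alpha_ab = R[a] + gamma * g_sum
        if alpha_ab @ b > best_val:
            best_val, best_alpha = alpha_ab @ b, alpha_ab
    return best_alpha

def perseus_backup_stage(B, V, T, O, R, gamma, rng=None):
    # One randomized PERSEUS backup stage: back up randomly chosen points
    # until the value of every point in B has improved.
    rng = rng or np.random.default_rng()
    value = lambda Vf, b: max(alpha @ b for alpha in Vf)
    V_new, todo = [], list(B)
    while todo:
        b = todo[rng.integers(len(todo))]
        alpha = point_backup(b, V, T, O, R, gamma)
        if alpha @ b >= value(V, b):
            V_new.append(alpha)                        # the backup improved b
        else:
            V_new.append(max(V, key=lambda a: a @ b))  # keep b's old best vector
        # One new alpha-vector may improve many points at once; drop every
        # point whose new value already matches or exceeds its old value.
        todo = [bp for bp in todo if value(V_new, bp) < value(V, bp)]
    return V_new

The final filter is the heart of the method: because a single backup may raise the value of many belief points, each stage typically performs far fewer backups than there are points in B.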
Citations

Anytime Point-Based Approximations for Large POMDPs
The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI), and a theoretical analysis justifying the choice of belief selection technique is presented.
Belief Selection in Point-Based Planning Algorithms for POMDPs
An empirical evaluation is presented illustrating how the performance of point-based value iteration (Pineau et al., 2003) varies with different belief selection approaches.
Point-Based Value Iteration for Finite-Horizon POMDPs
This paper presents a general point-based value iteration algorithm for finite-horizon POMDP problems which provides solutions with guarantees on solution quality, and introduces two heuristics that reduce the number of belief points considered during execution, lowering the computational requirements.
Exploration in POMDP belief space and its impact on value iteration approximation
Decision making under uncertainty is among the most challenging tasks in artificial intelligence. Although solution methods for this class of problems are intractable in general, some promising…
Anytime point based approximations for interactive POMDPs
Partially observable Markov decision processes (POMDPs) have been widely accepted as a rich framework for planning and control problems. In settings where multiple agents interact, POMDPs fail to…
Point-based online value iteration algorithm in large POMDP
A point-based online value iteration (PBOVI) algorithm that performs value backups at specific reachable belief points, rather than over the entire belief simplex, to speed up computation; it exploits a branch-and-bound pruning approach to prune the AND/OR tree of belief states online, and proposes reusing belief states that have already been searched to avoid repeated computation.
Run-Time Improvement of Point-Based POMDP Policies
This paper evaluates a variety of heuristics used to determine when plan repair might be useful, and then repairs the plan by sampling a small number of additional belief points and recomputing the policy.
Prioritizing Point-Based POMDP Solvers
This work generalizes the notion of prioritized backups to the POMDP framework, showing how existing point-based algorithms can be improved by prioritizing backups, and presents a new algorithm, prioritized value iteration, which empirically outperforms current point-based algorithms (a minimal sketch of error-based prioritization follows).
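Where PERSEUS picks its backup points uniformly at random, a prioritized solver orders them. A minimal sketch of one common prioritization signal, the Bellman error of each belief point, is given below; it reuses the hypothetical point_backup from the PERSEUS sketch above, and the error-based criterion is a plausible stand-in, not necessarily the paper's exact priority measure.

def prioritized_backup_stage(B, V, T, O, R, gamma, n_backups):
    # Always back up the belief point whose value a fresh backup would
    # raise the most (its Bellman error). Recomputing backups to score
    # every point is wasteful but keeps the sketch short.
    value = lambda Vf, b: max(alpha @ b for alpha in Vf)
    V = list(V)
    for _ in range(n_backups):
        errors = [point_backup(b, V, T, O, R, gamma) @ b - value(V, b)
                  for b in B]
        b = B[int(np.argmax(errors))]
        V.append(point_backup(b, V, T, O, R, gamma))
    return V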
Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes
This work develops a novel point-based value iteration algorithm that incorporates a greedy strategy for picking perception actions at each sampled belief point in each iteration, and proves that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy.

References

Showing 1–10 of 72 references.
Solving Factored POMDPs with Linear Value Functions
Partially Observable Markov Decision Processes (POMDPs) provide a coherent mathematical framework for planning under uncertainty when the state of the system cannot be fully observed. However, the…
Algorithms for partially observable markov decision processes
A Partially Observable Markov Decision Process (POMDP) is a general sequential decision-making model where the effects of actions are nondeterministic and only partial information about world states is…
Learning Policies for Partially Observable Environments: Scaling Up
This paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none is able to solve a slightly larger and noisier problem based on robot navigation.
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a model. Our approach is…
A point-based POMDP algorithm for robot planning
  • M. Spaan, N. Vlassis
  • IEEE International Conference on Robotics and Automation (ICRA), 2004
A simple, randomized procedure that performs value update steps improving the value of all belief points in each step; it belongs to the family of point-based value iteration solution techniques for POMDPs.
Robot Planning in Partially Observable Continuous Domains
Perseus, the previously proposed randomized point-based value iteration algorithm, is demonstrated in a simple robot planning problem in a continuous domain, where encouraging results are observed.
Finding Approximate POMDP solutions Through Belief Compression
This thesis describes a scalable approach to POMDP planning which uses low-dimensional representations of the belief space, and demonstrates how a variant of Principal Components Analysis (PCA) called Exponential family PCA can be used to compress certain kinds of large real-world POMDPs and to find policies for them (a compression sketch follows).
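The compression idea summarized above can be illustrated with plain linear PCA standing in for the Exponential family PCA used in the thesis; the sketch below is a simplification for intuition, not the thesis's method.

import numpy as np

def compress_beliefs(B, k):
    # Rows of B are belief vectors over |S| states; project them onto the
    # top-k principal components. (Linear PCA here; the thesis argues for
    # Exponential family PCA, which respects the simplex geometry better.)
    mean = B.mean(axis=0)
    U, S, Vt = np.linalg.svd(B - mean, full_matrices=False)
    components = Vt[:k]                 # k x |S| orthonormal basis
    codes = (B - mean) @ components.T   # low-dimensional belief coordinates
    recon = codes @ components + mean   # approximate reconstruction
    return codes, recon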
Approximate Planning in Large POMDPs via Reusable Trajectories
Upper bounds on the sample complexity are proved, showing that even for infinitely large and arbitrarily complex POMDPs the amount of data needed can be finite, depending only linearly on the complexity of the restricted strategy class Π and exponentially on the horizon time.
Value-Function Approximations for Partially Observable Markov Decision Processes
  • M. Hauskrecht
  • J. Artif. Intell. Res.
  • 2000
This work surveys various approximation methods, analyzes their properties and relations, and provides some new insights into their differences; it also presents a number of new approximation methods and novel refinements of existing techniques.
Exploiting structure to efficiently solve large scale partially observable markov decision processes
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of…