Using core beliefs for point-based value iteration


Recent research on point-based approximation algorithms for POMDPs has demonstrated that good solutions to POMDP problems can be obtained without considering the entire belief simplex. For instance, the Point-Based Value Iteration (PBVI) algorithm [Pineau et al., 2003] computes the value function only for a small set of belief states and iteratively adds more points to the set as needed. A key component of the algorithm is the strategy for selecting belief points so that the space of reachable beliefs is well covered. This paper presents a new method for selecting an initial set of representative belief points, which relies on first finding a basis for the reachable belief simplex. Our approach has better worst-case performance than the original PBVI heuristic and performs well on several standard POMDP tasks.
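
To make the idea of computing the value function "only for a small set of belief states" concrete, the following is a minimal sketch of a generic point-based Bellman backup in plain Python. It is not taken from the paper and does not implement the core-belief selection method; the array layouts `T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]` and the function names are assumptions chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma):
    """Return the alpha vector produced by one backup at belief b.

    Assumed (hypothetical) layouts:
      T[a][s][s2]  transition probability from s to s2 under action a
      O[a][s2][o]  probability of observing o after reaching s2 via a
      R[a][s]      immediate reward for taking a in s
      alphas       current value function as a list of alpha vectors
    """
    n_states = len(b)
    n_obs = len(O[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(n_obs):
            # Project every alpha vector back through the pair (a, o).
            projected = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            best_proj = max(projected, key=lambda g: dot(b, g))
            g_a = [g_a[s] + gamma * best_proj[s] for s in range(n_states)]
        val = dot(b, g_a)
        if val > best_val:
            best_vec, best_val = g_a, val
    return best_vec


def pbvi_sweep(beliefs, alphas, T, O, R, gamma):
    """Back up every belief in the set once; one new vector per point."""
    return [point_based_backup(b, alphas, T, O, R, gamma) for b in beliefs]
```

Because each sweep yields at most one vector per belief point, the cost of a sweep depends on the size of the belief set rather than on the full simplex, which is why the choice of points matters so much.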

Cite this paper

@inproceedings{Izadi2005UsingCB, title={Using core beliefs for point-based value iteration}, author={Masoumeh T. Izadi and Ajit V. Rajwade and Doina Precup}, booktitle={IJCAI}, year={2005} }