Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes

Abstract

Partially Observable Markov Decision Processes (POMDPs) provide a standard framework for sequential decision-making in stochastic environments. In this setting, an agent takes actions and receives observations and rewards from the environment. Many POMDP solution methods are based on computing a belief state, which is a probability distribution over possible…
DOI: 10.1007/11564096_58
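The abstract refers to belief-state methods in which the reward signal can inform the Bayesian update alongside the observation. A minimal sketch of that idea, assuming a toy two-state POMDP in which the reward is treated as a second observation channel; all states, distributions, and numbers below are illustrative, not taken from the paper:

```python
# Hypothetical two-state POMDP; every quantity here is illustrative.
S = [0, 1]  # hidden states

# T[a][s][s2]: transition probability P(s2 | s, a), single action a = 0
T = [[[0.9, 0.1], [0.2, 0.8]]]
# O[a][s2][o]: observation probability P(o | s2, a)
O = [[[0.7, 0.3], [0.4, 0.6]]]
# R[a][s2][r]: probability of receiving (discretized) reward r in successor
# state s2 -- the reward is folded in as an extra observation channel
R = [[[0.8, 0.2], [0.3, 0.7]]]

def belief_update(b, a, o, r):
    """Bayes update of belief b after action a, observation o, reward index r."""
    new_b = []
    for s2 in S:
        # predict step (sum over prior states), then weight by the
        # likelihoods of both the observation and the reward
        p = sum(b[s] * T[a][s][s2] for s in S) * O[a][s2][o] * R[a][s2][r]
        new_b.append(p)
    z = sum(new_b)  # normalizing constant
    if z == 0:
        raise ValueError("impossible action/observation/reward combination")
    return [p / z for p in new_b]

b0 = [0.5, 0.5]                       # uniform prior belief
b1 = belief_update(b0, a=0, o=0, r=0)
```

Because the reward likelihood multiplies into the same Bayes update as the observation likelihood, an informative reward distribution sharpens the posterior belief beyond what the observation alone provides.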

4 Figures and Tables