# Verification of indefinite-horizon POMDPs

```bibtex
@inproceedings{Bork2020VerificationOI,
  title     = {Verification of indefinite-horizon POMDPs},
  author    = {Alexander Bork and Sebastian Junges and Joost-Pieter Katoen and Tim Quatmann},
  booktitle = {ATVA},
  year      = {2020}
}
```

The verification problem in MDPs asks whether, for any policy resolving the nondeterminism, the probability that something bad happens is bounded by some given threshold. This verification problem is often overly pessimistic, as the policies it considers may depend on the complete system state. This paper considers the verification problem for partially observable MDPs, in which the policies make their decisions based on (the history of) the observations emitted by the system. We present an…
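As a concrete illustration of why observation-based policies differ from state-based ones, the sketch below implements the standard Bayesian belief update that underlies belief-based POMDP analysis: the policy never sees the state, only a distribution over states refined by each action/observation pair. The two-state model and all probabilities are invented for illustration, not taken from the paper.

```python
def belief_update(belief, action, obs, T, O):
    """Bayes update: b'(s') is proportional to O[s'][action][obs] * sum_s T[s][action][s'] * b(s)."""
    states = list(belief)
    new_belief = {}
    for s2 in states:
        pred = sum(T[s][action][s2] * belief[s] for s in states)  # prediction step
        new_belief[s2] = O[s2][action][obs] * pred                # correction step
    norm = sum(new_belief.values())  # probability of observing `obs`
    return {s: p / norm for s, p in new_belief.items()}

# Toy model: a robot is in state 'good' or 'bad', moves noisily under action
# 'go', and a sensor reports the new state correctly with probability 0.8.
T = {'good': {'go': {'good': 0.9, 'bad': 0.1}},
     'bad':  {'go': {'good': 0.2, 'bad': 0.8}}}
O = {'good': {'go': {'g': 0.8, 'b': 0.2}},
     'bad':  {'go': {'g': 0.2, 'b': 0.8}}}

b = {'good': 0.5, 'bad': 0.5}          # uniform initial belief
b = belief_update(b, 'go', 'g', T, O)  # act, then observe 'g'
print(b)  # belief shifts strongly toward 'good'
```

A policy for the POMDP maps such beliefs (equivalently, observation histories) to actions, which is exactly what makes the verification problem harder than for fully observable MDPs.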

## 10 Citations

Enforcing Almost-Sure Reachability in POMDPs

- Computer Science
- CAV
- 2021

This work presents an iterative symbolic approach that computes a winning region, that is, a set of system configurations such that all policies that stay within this set are guaranteed to satisfy the constraints.

Runtime Monitoring for Markov Decision Processes

- Computer Science
- ArXiv
- 2021

This work investigates the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics and presents a tractable algorithm based on model checking conditional reachability probabilities, whose applicability is demonstrated on a range of benchmarks.

Runtime Monitors for Markov Decision Processes

- Computer Science
- CAV
- 2021

This work investigates the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics and presents a tractable algorithm based on model checking conditional reachability probabilities, whose applicability is demonstrated on a range of benchmarks.

On the Verification of Belief Programs

- Computer Science
- ArXiv
- 2022

A formalism for belief programs based on a modal logic of actions and beliefs is proposed, which smoothly accommodates PCTL-like temporal properties; the decidability and undecidability of the verification problem for belief programs are investigated.

Under-Approximating Expected Total Rewards in POMDPs

- Computer Science
- TACAS
- 2022

This work considers the problem of whether the optimal expected total reward to reach a goal state in a partially observable Markov decision process (POMDP) lies below a given threshold, and provides two techniques: a simple cut-off technique that uses a good policy on the POMDP, and a more advanced belief clipping technique that uses minimal shifts of probabilities between beliefs.

Reinforcement Learning under Partial Observability Guided by Learned Environment Models

- Computer Science
- ArXiv
- 2022

This work combines Q-learning with IoAlergia, a method for learning Markov decision processes (MDPs), and provides RL with additional observations in the form of abstract environment states by simulating new experiences on learned environment models to track the explored states.

Inductive Synthesis of Finite-State Controllers for POMDPs

- Computer Science
- ArXiv
- 2022

A novel learning framework for obtaining finite-state controllers (FSCs) for partially observable Markov decision processes is presented, along with its applicability to indefinite-horizon specifications.

Probabilistic Model Checking and Autonomy

- Computer Science
- Annual Review of Control, Robotics, and Autonomous Systems
- 2021

An overview of probabilistic model checking is provided, focusing on models supported by the PRISM and PRISM-games model checkers, including fully observable and partially observable Markov decision processes, as well as turn-based and concurrent stochastic games, together with associated probabilistic temporal logics.

Gradient-Descent for Randomized Controllers under Partial Observability

- Computer Science
- VMCAI
- 2022

This paper shows how to define and evaluate gradients of pMCs and investigates varieties of gradient descent techniques from the machine learning community to synthesize the probabilities in a pMC.

The Probabilistic Model Checker Storm

- Computer Science
- International Journal on Software Tools for Technology Transfer
- 2021

The main features of Storm are reported, their effective use is explained, and an empirical evaluation of different configurations of Storm on the QComp 2019 benchmark set is presented.

## References

Showing 1-10 of 41 references

Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes

- Computer Science, Mathematics
- AAAI
- 2020

This work shows how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy in a partially observable Markov decision process (POMDP).
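The core representation behind point-based methods can be sketched in a few lines: the value function over beliefs is approximated as the maximum over a finite set of alpha-vectors, so evaluating (and acting at) any belief reduces to dot products. The vectors and beliefs below are made up for illustration; this is not the paper's implementation.

```python
def value(belief, alpha_vectors):
    """V(b) = max over alpha of sum_s alpha[s] * b[s]  (piecewise-linear, convex in b)."""
    return max(sum(a[s] * belief[s] for s in belief) for a in alpha_vectors)

# Two states; each alpha-vector is the value of one candidate conditional plan.
alphas = [{'s0': 1.0, 's1': 0.0},   # plan that pays off when we believe we are in s0
          {'s0': 0.2, 's1': 0.8}]   # plan that pays off when we believe we are in s1

print(value({'s0': 0.9, 's1': 0.1}, alphas))  # first plan wins: 0.9
print(value({'s0': 0.1, 's1': 0.9}, alphas))  # second plan wins: 0.74
```

Point-based value iteration grows and improves this alpha-vector set by performing Bellman backups only at a sampled set of belief points, which is what makes the approximation tractable.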

Verification of Markov Decision Processes Using Learning Algorithms

- Computer Science
- ATVA
- 2014

A general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs) is presented; it focuses on probabilistic reachability, a core property for verification, and is illustrated through two distinct instantiations.

Optimistic Value Iteration

- Computer Science
- CAV
- 2020

This paper obtains a lower bound via standard value iteration, uses the result to “guess” an upper bound, and proves the latter’s correctness; this optimistic value iteration approach is presented for computing reachability probabilities as well as expected rewards.
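The three-step loop just described can be sketched on a toy Markov chain: iterate to a lower bound, optimistically bump it by a small epsilon, then check that the candidate is an inductive upper bound (applying the Bellman operator does not increase it). The chain, epsilon, and iteration count are illustrative, not from the paper.

```python
# States 0 and 1 try to reach the absorbing state 'goal'; 'sink' is also absorbing.
# P[s] = list of (probability, successor).
P = {0: [(0.5, 'goal'), (0.5, 1)],
     1: [(0.4, 'goal'), (0.6, 'sink')]}

def bellman(v):
    def val(t):
        if t == 'goal': return 1.0
        if t == 'sink': return 0.0
        return v[t]
    return {s: sum(p * val(t) for p, t in P[s]) for s in P}

# Step 1: lower bound via standard value iteration (from below).
v = {s: 0.0 for s in P}
for _ in range(200):
    v = bellman(v)

# Step 2: optimistically guess an upper bound.
eps = 1e-6
u = {s: min(1.0, x + eps) for s, x in v.items()}

# Step 3: verify the guess is inductive; a real OVI loop would refine on failure.
assert all(bellman(u)[s] <= u[s] for s in P)
print(v[0])  # reachability probability from state 0 (0.7 for this chain)
```

Together, `v` and `u` bracket the true reachability probabilities, which is exactly the soundness guarantee plain value iteration lacks.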

Solving POMDPs by Searching the Space of Finite Policies

- Computer Science
- UAI
- 1999

This paper explores the problem of finding the optimal policy from a restricted set of policies, represented as finite-state automata of a given size, and demonstrates good empirical results with a branch-and-bound method for finding globally optimal deterministic policies and a gradient-ascent method for finding locally optimal stochastic policies.

Verification and control of partially observable probabilistic systems

- Computer Science
- Real-Time Systems
- 2017

Probabilistic temporal logics are given that can express a range of quantitative properties of partially observable, probabilistic systems for both discrete and dense models of time, relating to the probability of an event’s occurrence or the expected value of a reward measure.

Bounded Model Checking for Probabilistic Programs

- Computer Science
- ATVA
- 2016

This paper proposes an on-the-fly approach where the operational model is successively created and verified via a step-wise execution of the program, enabling key features of many probabilistic programs to be taken into account: nondeterminism and conditioning.

On the undecidability of probabilistic planning and related stochastic optimization problems

- Computer Science
- Artif. Intell.
- 2003

Motion planning under partial observability using game-based abstraction

- Computer Science
- 2017 IEEE 56th Annual Conference on Decision and Control (CDC)
- 2017

This work addresses motion planning problems where agents move inside environments that are not fully observable and are subject to uncertainties, by exploiting typical structural properties of such scenarios; for instance, it assumes that agents can observe their own positions inside an environment.

Point-Based Value Iteration for Finite-Horizon POMDPs

- Computer Science
- J. Artif. Intell. Res.
- 2019

This paper presents a general point-based value iteration algorithm for finite-horizon POMDP problems which provides solutions with guarantees on solution quality and introduces two heuristics to reduce the number of belief points considered during execution, which lowers the computational requirements.