The variance-penalized stochastic shortest path problem

Jakob Piribauer, Ocan Sankur, Christel Baier. "The variance-penalized stochastic shortest path problem." In: International Colloquium on Automata, Languages and Programming (ICALP).
The stochastic shortest path problem (SSPP) asks to resolve the non-deterministic choices in a Markov decision process (MDP) such that the expected accumulated weight before reaching a target state is maximized. This paper addresses the optimization of the variance-penalized expectation (VPE) of the accumulated weight, which is a variant of the SSPP in which a multiple of the variance of accumulated weights is incurred as a penalty. It is shown that the optimal VPE in MDPs with non-negative… 
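The variance-penalized objective described in the abstract can be sketched as follows (the penalty factor λ and the notation are illustrative assumptions, not necessarily the paper's exact notation):

```latex
% Maximize, over schedulers \sigma, the expected accumulated weight
% before reaching the goal minus a variance penalty; \lambda > 0 is
% the (assumed) penalty factor.
\[
  \mathrm{VPE}_{\lambda}(\sigma)
    \;=\; \mathbb{E}^{\sigma}\!\bigl[\mathit{rew}_{\lozenge\,\mathit{goal}}\bigr]
    \;-\; \lambda \cdot \mathbb{V}^{\sigma}\!\bigl[\mathit{rew}_{\lozenge\,\mathit{goal}}\bigr],
  \qquad
  \mathrm{VPE}^{\max} \;=\; \sup_{\sigma}\, \mathrm{VPE}_{\lambda}(\sigma).
\]
```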

Stochastic Shortest Paths and Weight-Bounded Properties in Markov Decision Processes

It is shown that the problem of deciding whether a disjunction of weight-bounded reachability conditions holds almost surely under some scheduler lies in NP ∩ coNP, is solvable in pseudo-polynomial time, and is at least as hard as solving two-player mean-payoff games.

Variance minimization of parameterized Markov decision processes

  • L. Xia
  • Mathematics
    Discret. Event Dyn. Syst.
  • 2018
This paper studies the variance-minimization problem for Markov decision processes (MDPs) in which the policy is parameterized by action-selection probabilities or other general parameters, and derives a difference formula for the reward variance under any two parametric policies.

Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance

  • L. Xia
  • Computer Science
  • 2020
This paper investigates the optimization of an infinite-stage, discrete-time Markov decision process (MDP) under a long-run average metric that combines the mean and variance of rewards, and develops an iterative algorithm in the style of policy iteration that is proved to converge to local optima in both the mixed and randomized policy spaces.

Computing Minimum and Maximum Reachability Times in Probabilistic Systems

The methods are based on algorithms that transform the original problems into equivalent problems satisfying the required assumptions, leading to efficient solutions of two basic problems in the analysis of the reliability and performance of partially specified systems.
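As a rough illustration of the kind of computation involved (a generic value-iteration sketch, not the paper's transformation-based algorithms; all names and the data layout are assumptions):

```python
# Value-iteration sketch: minimum expected number of steps to reach a
# target set in a finite MDP. Assumes a proper policy exists (the target
# is reachable), so the minimum expected times are finite.

def min_expected_steps(states, actions, P, target, iters=10_000, tol=1e-9):
    """P[s][a] is a list of (successor, probability) pairs; target is a set."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V_new = {
            s: 0.0 if s in target
            # Bellman update: one step plus the best expected continuation.
            else 1.0 + min(sum(p * V[t] for t, p in P[s][a])
                           for a in actions(s))
            for s in states
        }
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new
    return V
```

For example, a state that reaches the goal with probability 0.5 and otherwise loops back to itself has minimum expected reachability time 2, the mean of a geometric distribution, and the iteration converges to that fixed point.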

Mean-Variance Tradeoffs in an Undiscounted MDP

A stationary policy and an initial state in a Markov decision process (MDP) induce a stationary probability distribution of the reward; the paper characterizes the policies that generate the Pareto optima in the sense of high mean and low variance of this stationary distribution.

Computing Quantiles in Markov Reward Models

The complexity of quantile queries for until properties in discrete-time finite-state Markov decision processes with nonnegative rewards on states is studied; it is shown that qualitative quantile queries can be evaluated in polynomial time, and an exponential-time algorithm for the evaluation of quantitative quantile queries is presented.
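A quantile query of the kind described can be sketched as follows (the notation is an illustrative assumption, not the paper's exact formulation):

```latex
% Smallest reward bound r such that some scheduler reaches the goal
% with accumulated reward at most r with probability at least p.
\[
  \mathrm{qu}(p)
    \;=\; \min\bigl\{\, r \in \mathbb{N} \;:\;
      \Pr^{\max}_{s}\!\bigl(\lozenge^{\le r}\,\mathit{goal}\bigr) \ge p \,\bigr\}.
\]
```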

An Analysis of Stochastic Shortest Path Problems

We consider a stochastic version of the classical shortest path problem whereby for each node of a graph, we must choose a probability distribution over the set of successor nodes so as to reach a…

Trading Performance for Stability in Markov Decision Processes

It is shown that a strategy ensuring both an expected mean payoff above a given bound and a variance below a given bound requires randomization and memory, and that the problem of finding such a strategy can be expressed as a set of constraints.
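The combined performance/stability requirement can be sketched as the following pair of constraints on a strategy σ (the symbols and thresholds are illustrative assumptions):

```latex
% Mean payoff at least \beta (performance) while keeping its variance
% at most \gamma (stability), for given thresholds \beta and \gamma.
\[
  \mathbb{E}^{\sigma}\!\bigl[\mathrm{mp}\bigr] \;\ge\; \beta,
  \qquad
  \mathbb{V}^{\sigma}\!\bigl[\mathrm{mp}\bigr] \;\le\; \gamma.
\]
```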