Robust Markov Decision Processes

@article{Wiesemann2013RobustMD,
  title={Robust Markov Decision Processes},
  author={Wolfram Wiesemann and Daniel Kuhn and Berç Rustem},
  journal={Math. Oper. Res.},
  year={2013},
  volume={38},
  pages={153-183}
}
Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use because of their sensitivity to distributional model parameters, which are typically unknown and have to be estimated by the decision maker. To counter the detrimental effects of estimation errors, we consider robust MDPs that offer probabilistic guarantees in view of the unknown parameters. To this end, we assume that an… 
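
The truncated abstract refers to the standard robust MDP setup, in which nature adversarially picks the transition law from an ambiguity set. A schematic max-min objective in generic discounted form (the paper itself constructs ambiguity sets from an observation history, which is omitted here):

  \max_{\pi} \; \min_{P \in \mathcal{P}} \; \mathbb{E}^{\pi,P}\Big[ \sum_{t=0}^{\infty} \gamma^{t} \, r(s_t, a_t) \Big]

where \pi is a policy, \mathcal{P} is the ambiguity set of transition kernels, and \gamma \in (0,1) is the discount factor.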

Citations

Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters

TLDR
This paper presents and evaluates approaches to compute the pure Pareto optimal policies in the value vector space for bounded-parameter MDPs (BMDPs), a popular model for performance analysis and optimization of stochastic systems.
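
For context, a bounded-parameter MDP replaces each transition probability with an interval, and the pessimistic value function satisfies an interval Bellman equation. A sketch in standard BMDP notation (not taken from the paper itself):

  \underline{V}(s) = \max_{a} \Big[ r(s,a) + \gamma \min_{P(\cdot \mid s,a) \in [\underline{P}, \overline{P}]} \sum_{s'} P(s' \mid s,a) \, \underline{V}(s') \Big]

with the inner minimization over distributions consistent with the interval bounds; the optimistic value \overline{V} is obtained by replacing the min with a max, and the pessimistic/optimistic value pair is the kind of value-vector trade-off that the Pareto analysis above operates on.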

Multi-Criteria Approaches to Markov Decision Processes with Uncertain Transition Parameters

Markov decision processes (MDPs) are a well-established model for planning under uncertainty. In most situations the MDP parameters are estimated from real observations, so that their values are not…

Robust Markov Decision Processes: Beyond Rectangularity

TLDR
Robust counterparts of important structural results of classical MDPs, including the maximum principle and Blackwell optimality, are introduced, and a computational study demonstrates the effectiveness of the approach in mitigating the conservativeness of robust policies.
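
"Rectangularity" here is the assumption that the uncertainty set factorizes across state-action pairs, so the adversary may choose the worst transition law independently at each pair. In standard notation (a textbook definition, not specific to this paper):

  \mathcal{P} = \prod_{(s,a) \in S \times A} \mathcal{P}_{s,a}, \qquad \mathcal{P}_{s,a} \subseteq \Delta(S)

Dropping this product structure couples the adversary's choices across states, which is what makes non-rectangular robust MDPs harder to solve but less conservative.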

Decomposition methods for solving Markov decision processes with multiple models of the parameters

TLDR
This work considers decision-making in Markov decision processes (MDPs) when the reward or transition probability parameters are not known with certainty, and studies decomposition methods for the setting in which multiple plausible models of these parameters are available.

Robust Anytime Learning of Markov Decision Processes

TLDR
This work continuously learns the transition probabilities of an MDP in a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies that provably adhere to formal specifications.
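
One plausible way to realize the Bayesian-inference half of such an approach is to keep Dirichlet counts per state-action pair and convert them into probability intervals for an interval-MDP solver. The sketch below is an illustrative assumption, not the paper's implementation; the function name and the credible-interval rule are invented for the example.

  # Sketch: Dirichlet posterior over outgoing transition probabilities,
  # turned into per-successor credible intervals for a robust solver.
  import numpy as np
  from scipy.stats import beta

  def posterior_intervals(counts, prior=1.0, conf=0.95):
      """counts: observed transitions out of one (state, action) pair."""
      alpha = np.asarray(counts, dtype=float) + prior  # posterior parameters
      rest = alpha.sum() - alpha                       # mass on other successors
      # Each marginal of a Dirichlet(alpha) is Beta(alpha_i, alpha_0 - alpha_i).
      lo = beta.ppf((1 - conf) / 2, alpha, rest)
      hi = beta.ppf(1 - (1 - conf) / 2, alpha, rest)
      return lo, hi

  # Anytime loop: update counts as transitions arrive, recompute intervals,
  # and re-solve the induced interval MDP for a robust policy.
  lo, hi = posterior_intervals([8, 1, 1])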

Bayesian Risk Markov Decision Processes

TLDR
This work proposes a new formulation, the Bayesian risk Markov decision process (BR-MDP), to address parameter uncertainty in MDPs: a risk functional is applied in nested form to the expected total cost with respect to the Bayesian posterior distributions of the unknown parameters.

A Bayesian Risk Approach to MDPs with Parameter Uncertainty

TLDR
This work proposes a Bayesian risk approach to MDPs with parameter uncertainty, in which a risk functional is applied in nested form to the expected discounted total cost with respect to the Bayesian posterior distributions of the unknown parameters at each time stage.
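
The two Bayesian-risk papers above share the same nested construction. Schematically, with a risk functional \rho (for example, CVaR) evaluated with respect to the posterior \mu_t over the unknown parameter at stage t (notation illustrative):

  \min_{\pi} \; \rho_{\mu_0}\Big( c_0 + \rho_{\mu_1}\big( c_1 + \cdots + \rho_{\mu_{T-1}}(c_{T-1}) \cdots \big) \Big)

where c_t is the stage cost and \mu_{t+1} is the Bayesian update of \mu_t after observing the next transition; taking \rho to be the expectation recovers a Bayes-adaptive formulation, while more risk-averse choices move toward the worst case.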

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

TLDR
This work proposes non-rectangular uncertainty sets that bound marginal moments of state-action features defined over entire trajectories through a decision process, which enables generalization to different portions of the state space while retaining appropriate uncertainty of the decision process.

Robust analysis of discounted Markov decision processes with uncertain transition probabilities

Optimal policies in Markov decision problems may be quite sensitive to the transition probabilities. In practice, some transition probabilities may be uncertain. The goals of the present…

Quantile Markov Decision Processes

The goal of a traditional Markov decision process (MDP) is to maximize expected cumulative reward over a defined horizon (possibly infinite). In many applications, however, a decision maker may be…
...

References

Showing 1-10 of 36 references

Percentile Optimization for Markov Decision Processes with Parameter Uncertainty

TLDR
This paper presents a set of percentile criteria that are conceptually natural and representative of the trade-off between optimistic and pessimistic views of the question, and studies the use of these criteria under different forms of uncertainty for both the rewards and the transitions.
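
A percentile criterion of this kind can be written as a chance-constrained program: maximize a performance level y that the parameter-dependent return must exceed with prescribed probability under the decision maker's distribution over parameters (schematic form):

  \max_{\pi,\, y} \; y \quad \text{s.t.} \quad \mathbb{P}_{\theta \sim \psi}\Big( \mathbb{E}^{\pi,\theta}\big[ \textstyle\sum_t \gamma^t r_t \big] \ge y \Big) \ge 1 - \epsilon

where \psi encodes the uncertainty over the unknown parameters \theta and \epsilon is the tolerated risk; smaller \epsilon pushes toward the pessimistic (robust) view.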

Robust Control of Markov Decision Processes with Uncertain Transition Matrices

TLDR
This work considers a robust control problem for a finite-state, finite-action Markov decision process in which uncertainty in the transition matrices is described by possibly nonconvex sets. It shows that perfect duality holds for this problem and that it can be solved with a variant of the classical dynamic programming algorithm, the "robust dynamic programming" algorithm.
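
As an illustration of the robust dynamic programming recursion, the sketch below runs robust value iteration with a finite set of candidate transition matrices per action standing in for the paper's uncertainty sets; the per-state worst case is the rectangularity assumption at work.

  import numpy as np

  def robust_value_iteration(r, P_sets, gamma=0.95, tol=1e-8):
      """r: (S, A) rewards; P_sets[a]: (K, S, S) candidate row-stochastic
      transition matrices for action a (a finite ambiguity set)."""
      S, A = r.shape
      V = np.zeros(S)
      while True:
          Q = np.empty((S, A))
          for a in range(A):
              # Worst case over the K candidate models, taken per state
              # (this independent per-state choice is rectangularity).
              worst = (P_sets[a] @ V).min(axis=0)
              Q[:, a] = r[:, a] + gamma * worst
          V_new = Q.max(axis=1)
          if np.abs(V_new - V).max() < tol:
              return V_new, Q.argmax(axis=1)
          V = V_new

  # Tiny example: 2 states, 2 actions, 2 candidate models per action.
  rng = np.random.default_rng(0)
  r = rng.random((2, 2))
  P_sets = [rng.dirichlet(np.ones(2), size=(2, 2)) for _ in range(2)]
  V, pi = robust_value_iteration(r, P_sets)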

The Robustness-Performance Tradeoff in Markov Decision Processes

TLDR
This paper proposes a method that computes the whole set of Pareto efficient policies in the performance-robustness plane when only the reward parameters are subject to uncertainty and shows that the strategy with the "optimal" tradeoff might be non-Markovian and hence is in general not tractable.

A Robust Approach to Markov Decision Problems with Uncertain Transition Probabilities

This paper considers a discrete-time infinite horizon discounted cost Markov decision problem in which the transition probability vector for each state-control pair is uncertain. A popular…

Markov Decision Processes with Imprecise Transition Probabilities

TLDR
The numerical procedures for calculating an optimal max-min strategy are based on successive approximations, reward revision, and modified policy iteration, and the bounds that are determined are at least as tight as currently available bounds for the case where the transition probabilities are precise.

Markov Decision Processes: Discrete Stochastic Dynamic Programming

  • M. Puterman
  • Computer Science
    Wiley Series in Probability and Statistics
  • 1994
TLDR
Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.

State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms

TLDR
A wide range of models in such areas as quality control, machine maintenance, internal auditing, learning, and optimal stopping is discussed within the POMDP framework.

Primal and dual linear decision rules in stochastic and robust optimization

TLDR
This paper proposes an efficient method to estimate the approximation error introduced by this rather drastic means of complexity reduction: it applies the linear decision rule restriction not only to the primal but also to a dual version of the stochastic program.
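
The mechanism behind the TLDR, for a minimization problem: restricting the recourse decisions to affine functions of the uncertainty, x_t(\xi) = X_t \xi, in the primal yields an upper bound on the true optimum, and the same restriction applied to a dual formulation yields a lower bound, so (schematically)

  z^{\mathrm{LDR}}_{\mathrm{dual}} \;\le\; z^{\star} \;\le\; z^{\mathrm{LDR}}_{\mathrm{primal}}

and the gap between the two tractable bounds estimates the loss incurred by the linear decision rule approximation.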

Robust Dynamic Programming

  • G. Iyengar
  • Mathematics, Economics
    Math. Oper. Res.
  • 2005
TLDR
It is proved that when the underlying set of transition measures has a certain "rectangularity" property, all of the main results for finite and infinite horizon DP extend to natural robust counterparts.
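
Under that rectangularity property, the robust value function satisfies a Bellman-style recursion in which nature minimizes independently at each state-action pair (standard form):

  V(s) = \max_{a} \Big[ r(s,a) + \gamma \min_{p \in \mathcal{P}(s,a)} \sum_{s'} p(s') \, V(s') \Big]

and robust value and policy iteration then converge just as in classical dynamic programming.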