Risk-Sensitive Markov Control Processes

Yun Shen, Wilhelm Stannat, Klaus Obermayer
SIAM Journal on Control and Optimization
We introduce a general framework for measuring risk in the context of Markov control processes with risk maps on general Borel spaces, which generalizes known concepts of risk measures in mathematical finance, operations research, and behavioral economics. Within this framework, using weighted-norm spaces to also accommodate unbounded costs, we study two types of infinite-horizon risk-sensitive criteria, discounted total risk and average risk, and solve the associated optimization problems by…
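The abstract's discounted total-risk criterion can be made concrete with one specific risk map, the entropic map. The sketch below is illustrative only: it assumes a finite state/action model rather than the paper's general Borel-space construction, and the function names and toy transition data are invented for the example.

```python
import numpy as np

# Illustrative sketch (not the paper's general Borel-space construction):
# discounted risk-sensitive value iteration on a finite model, using the
# entropic risk map
#     rho_beta(V)(x, a) = (1/beta) * log E_{x' ~ P(.|x,a)}[exp(beta * V(x'))],
# which reduces to the risk-neutral expectation as beta -> 0.

def entropic_risk(P_xa, V, beta):
    """Entropic risk of the successor value V under transition row P_xa."""
    m = np.max(beta * V)  # log-sum-exp shift for numerical stability
    return (m + np.log(P_xa @ np.exp(beta * V - m))) / beta

def risk_sensitive_vi(P, c, gamma=0.9, beta=0.5, iters=500):
    """P: (S, A, S) transition tensor, c: (S, A) one-step costs."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.array([[c[x, a] + gamma * entropic_risk(P[x, a], V, beta)
                       for a in range(A)] for x in range(S)])
        V = Q.min(axis=1)  # minimize the discounted total risk
    return V, Q.argmin(axis=1)

# Hypothetical 2-state, 2-action toy model (data invented for the example)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[1.0, 2.0], [0.5, 1.5]])
V, policy = risk_sensitive_vi(P, c)
```

For beta > 0 the entropic map penalizes cost variability (by Jensen's inequality it dominates the plain expectation), which is what makes the resulting control risk-averse rather than risk-neutral.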


Prospect-theoretic Q-learning

Model and Reinforcement Learning for Markov Games with Risk Preferences

A new model for non-cooperative Markov games that captures the interactions of risk-aware players is motivated and proposed, and the existence of such equilibria in stationary strategies is demonstrated by an application of Kakutani's fixed-point theorem.

Some contributions to Markov decision processes

In a nutshell, this thesis studies discrete-time Markov decision processes (MDPs) on Borel spaces, with possibly unbounded costs, under both the expected (discounted) total cost and the long-run expected average cost criteria.

Markov decision processes with iterated coherent risk measures

The Bellman optimality equation is established, along with value and policy iteration algorithms, and the existence of a deterministic stationary optimal policy is shown.

Risk-sensitive Markov Control Processes with Strictly Convex Risk Maps

We fully develop the Lyapunov approach to optimal control problems of Markov control processes on general Borel spaces equipped with risk maps, in particular strictly convex risk maps, including…

On Average Risk-sensitive Markov Control Processes

We introduce the Lyapunov approach to optimal control problems of average risk-sensitive Markov control processes with general risk maps, motivated by applications in particular to behavioral economics…

Risk-sensitive Markov Decision Processes

A family of model-free risk-sensitive reinforcement learning algorithms for solving the optimization problems corresponding to risk-sensitive valuations is presented, and it is shown that, when appropriate utility functions are chosen, agents' behaviors express key features of human behavior as predicted by prospect theory.
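One way such utility-based algorithms are often structured (a sketch under assumed forms, not necessarily the cited work's exact update) is to pass the temporal-difference error through a nonlinear utility `u` before applying it, so that losses move the estimate more than equally sized gains:

```python
import numpy as np

# Sketch of a utility-shaped Q-learning update (an assumed form for
# illustration): the temporal-difference error is passed through a
# nonlinear utility u, so losses are weighted more heavily than gains,
# in the spirit of prospect theory.

def u(delta, k_gain=1.0, k_loss=2.25):
    """Loss-averse utility applied to the TD error (hypothetical choice)."""
    return k_gain * delta if delta >= 0 else k_loss * delta

def td_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One risk-sensitive Q-learning step on a tabular Q (states x actions)."""
    delta = r + gamma * Q[s_next].max() - Q[s, a]
    Q[s, a] += alpha * u(delta)
    return Q

Q = np.zeros((3, 2))                  # hypothetical 3-state, 2-action table
td_update(Q, 0, 1, r=-1.0, s_next=2)  # a loss moves Q further than a gain would
```

With `k_loss > k_gain` the learned values are pessimistic about variable outcomes, which is one mechanism by which such agents reproduce loss-averse behavior.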

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

This work offers the first non-asymptotic theoretical analyses of non-stationary risk-sensitive RL in the literature and presents a meta-algorithm that requires no prior knowledge of the variation budget and can adaptively detect non-stationarity in the exponential value functions.

Privacy-Preserving Reinforcement Learning Beyond Expectation

This work designs an algorithm to enable an RL agent to learn policies to maximize a CPT-based objective in a privacy-preserving manner and establishes guarantees on the privacy of value functions learned by the algorithm when rewards are sufficiently close.
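A CPT-based objective replaces the plain expectation with a utility applied to outcomes and a nonlinear weighting of probabilities. The sketch below uses the separable (non-cumulative) prospect-theory form for simplicity, with Tversky–Kahneman-style parameter values for illustration; the exact objective in the cited work may differ.

```python
# Sketch of a prospect-theoretic (CPT-style) valuation of a discrete lottery.
# Parameter values follow Tversky & Kahneman's 1992 estimates, used here
# purely for illustration.

def w(p, gamma=0.61):
    """Inverse-S probability weighting: overweights small probabilities."""
    return p**gamma / (p**gamma + (1.0 - p)**gamma) ** (1.0 / gamma)

def v(x, alpha=0.88, lam=2.25):
    """S-shaped value function: concave for gains, steeper for losses."""
    return x**alpha if x >= 0 else -lam * (-x)**alpha

def pt_value(outcomes, probs):
    """Weighted sum of utilities of outcomes (assumes probs sum to 1)."""
    return sum(w(p) * v(x) for x, p in zip(outcomes, probs))

# A hypothetical lottery: small chance of a large gain
pt_value([100.0, 0.0], [0.05, 0.95])
```

The weighting function is what makes rare events matter more than their raw probability suggests, a key departure of CPT-based objectives from expected value.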



Further topics on discrete-time Markov control processes

Contents (Chapter 7, Ergodicity and Poisson's Equation): 7.1 Introduction; 7.2 Weighted norms and signed kernels (A. Weighted-norm spaces, B. Signed kernels, C. Contraction maps); 7.3 Recurrence concepts; …

Convex measures of risk and trading constraints

The notion of a convex measure of risk is introduced as an extension of the concept of a coherent risk measure defined in Artzner et al. (1999), and corresponding extensions of the representation theorem in terms of probability measures on the underlying space of scenarios are proved.
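A standard concrete instance of a coherent (hence convex) risk measure is Conditional Value-at-Risk; the minimal estimator below, for equally likely scenario losses, is illustrative and not taken from the cited paper.

```python
import numpy as np

# A standard concrete instance of a coherent risk measure:
# Conditional Value-at-Risk (expected shortfall), estimated here for
# equally likely scenario losses as the mean of the worst (1 - alpha)
# fraction of scenarios. Illustrative sketch only.

def cvar(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) fraction of scenario losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    k = max(1, int(np.ceil((1.0 - alpha) * len(losses))))
    return losses[-k:].mean()

sample = [1.0, 2.0, 3.0, 4.0]
cvar(sample, alpha=0.75)  # worst 25% of four scenarios
```

The defining axioms show up numerically: adding a constant to every scenario shifts the estimate by that constant (cash additivity), and scaling all losses scales it proportionally (positive homogeneity).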

Discrete-time controlled Markov processes with average cost criterion: a survey

This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem.

Subgradients of Law-Invariant Convex Risk Measures on L1

We introduce a generalised subgradient for law-invariant closed convex risk measures on L1 and establish its relationship with optimal risk allocations and equilibria. Our main result gives…

Markov Decision Processes: Discrete Stochastic Dynamic Programming

  • M. Puterman
  • Computer Science
    Wiley Series in Probability and Statistics
  • 1994
Markov Decision Processes covers recent research advances in areas such as countable state space models with the average reward criterion, constrained models, and models with risk-sensitive optimality criteria, and explores several topics that have received little or no attention in other books.

Yet Another Look at Harris’ Ergodic Theorem for Markov Chains

The aim of this note is to present an elementary proof of a variation of Harris’ ergodic theorem of Markov chains.

Risk Sensitive Markov Decision Processes

Risk-sensitive control is an area of significant current interest in stochastic control theory. It is a generalization of the classical, risk-neutral approach, whereby we seek to minimize an…

Controlled Markov chains with exponential risk-sensitive criteria: modularity, structured policies and applications

Monotonicity properties of value functions and optimal policies are established in controlled Markov chain models with a countable state space, under (exponential) total and discounted risk-sensitive cost criteria.

Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains

It is established that the optimal average cost is characterized by an optimality inequality, and it is shown that, even for bounded costs, such an inequality may be strict at every state.

Coherent Risk Measures on General Probability Spaces

We extend the definition of coherent risk measures, as introduced by Artzner, Delbaen, Eber and Heath, to general probability spaces, and we show how to define such measures on the space of all random variables.