# Simple Strategies in Multi-Objective MDPs

```bibtex
@article{Delgrange2020SimpleSI,
  title={Simple Strategies in Multi-Objective MDPs},
  author={Florent Delgrange and Joost-Pieter Katoen and Tim Quatmann and Mickael Randour},
  journal={Tools and Algorithms for the Construction and Analysis of Systems},
  year={2020},
  volume={12078},
  pages={346--364}
}
```
• Published 24 October 2019
• Computer Science
• Tools and Algorithms for the Construction and Analysis of Systems
We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among the objectives by obtaining a Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to…
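As a toy illustration of the achievability question (not the paper's MILP encoding), the sketch below brute-forces all pure stationary strategies of a small hypothetical two-state, two-objective MDP with deterministic transitions, and checks whether a given reward point is weakly dominated by some strategy's value. The MDP, its rewards, and the helper names are made up for illustration.

```python
from itertools import product

# Hypothetical two-objective MDP: states s0, s1 plus an absorbing "done"
# state. Each action yields a 2-dimensional reward vector; transitions
# are deterministic here to keep the sketch short.
mdp = {
    "s0": {"a": ((1.0, 0.0), "s1"), "b": ((0.0, 1.0), "s1")},
    "s1": {"a": ((2.0, 0.0), "done"), "b": ((0.0, 1.0), "done")},
}

def value(strategy, state="s0"):
    """Total reward vector collected by a pure stationary strategy."""
    total = [0.0, 0.0]
    while state != "done":
        reward, state = mdp[state][strategy[state]]
        total[0] += reward[0]
        total[1] += reward[1]
    return tuple(total)

def achievable(point):
    """Brute-force check: is `point` weakly dominated by the value of some
    pure stationary strategy? This enumeration is exponential in the number
    of states; the paper's MILP encoding avoids it."""
    states = list(mdp)
    for choice in product(*(mdp[s] for s in states)):
        v = value(dict(zip(states, choice)))
        if all(v[i] >= point[i] for i in range(2)):
            return True
    return False

print(achievable((3.0, 0.0)))  # True: always playing "a" yields (3, 0)
print(achievable((1.0, 1.0)))  # True: "a" in s0, "b" in s1 yields (1, 1)
print(achievable((2.0, 2.0)))  # False: no pure stationary strategy dominates it
```

The four pure stationary strategies realize the points (3, 0), (1, 1), (2, 1), and (0, 2), so (2, 2) lies outside the pure-strategy achievable set even though it is a convex combination of achievable points.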
## 13 Citations
Simple Strategies in Multi-Objective MDPs (Technical Report)
• Computer Science
ArXiv
• 2019
It is shown that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and the authors provide an MILP encoding to solve the corresponding problem.
On Minimizing Total Discounted Cost in MDPs Subject to Reachability Constraints
• Mathematics
• 2021
We study the synthesis of a policy in a Markov decision process (MDP) following which an agent reaches a target state in the MDP while minimizing its total discounted cost. The problem combines a…
Stochastic Games with Disjunctions of Multiple Objectives (Technical Report)
• Computer Science
GandALF
• 2021
A fine-grained overview of strategy and computational complexity is presented and a novel value iteration-style algorithm for approximating the set of Pareto optimal thresholds for a given DQ is proposed.
Evolutionary-Guided Synthesis of Verified Pareto-Optimal MDP Policies
• Computer Science
2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)
• 2021
This work uses case studies from the service-based systems and robotic control software domains to show that the new MDP policy synthesis approach can handle a wide range of QoS requirement combinations unsupported by current probabilistic model checkers.
Arena-Independent Finite-Memory Determinacy in Stochastic Games
• Computer Science
CONCUR
• 2021
These contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs.
Multi-Objective Controller Synthesis with Uncertain Human Preferences
• Computer Science
2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS)
• 2022
This work formalizes the notion of uncertain human preferences, and presents a novel approach that accounts for this uncertainty in the context of multi-objective controller synthesis for Markov decision processes (MDPs).
Stochastic Games with Lexicographic Reachability-Safety Objectives
• Computer Science
CAV
• 2020
An algorithm is presented that computes lexicographically optimal strategies via a reduction to computation of optimal strategies in a sequence of single-objectives games.
Different strokes in randomised strategies: Revisiting Kuhn's theorem under finite-memory assumptions
• Computer Science
ArXiv
• 2022
This work studies two-player turn-based stochastic games and provides a complete taxonomy of the classes of finite-memory strategies obtained by varying which of the three aforementioned components are randomised.
Games Where You Can Play Optimally with Arena-Independent Finite Memory
• Computer Science
CONCUR
• 2020
A complete characterization of preference relations that admit optimal strategies using arena-independent finite memory is established, generalizing the work of Gimbert and Zielonka to the finite-memory case and proving an equivalent to their celebrated corollary.
Entropy-Guided Control Improvisation
• Computer Science
Robotics: Science and Systems
• 2021
This framework, which extends the state of the art by supporting arbitrary combinations of adversarial and probabilistic uncertainty in the environment, enables a flexible modeling formalism that is argued, theoretically and empirically, to remain tractable.

## References

Showing 1-10 of 51 references
Simple Strategies in Multi-Objective MDPs (Technical Report)
• Computer Science
ArXiv
• 2019
It is shown that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and the authors provide an MILP encoding to solve the corresponding problem.
Markov Decision Processes with Multiple Objectives
• Computer Science
STACS
• 2006
It is shown that every Pareto-optimal point can be achieved by a memoryless strategy; however, unlike in the single-objective case, the memoryless strategies may require randomization.
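The need for randomization can be seen on a hypothetical one-state example (not taken from the cited paper): with two actions yielding reward vectors (1, 0) and (0, 1), the two pure memoryless strategies realize only the extreme points, while mixing with probability p realizes every convex combination (p, 1 - p).

```python
# Hypothetical one-shot, two-objective example: action "a" yields reward
# (1, 0), action "b" yields (0, 1). A pure memoryless strategy commits to
# one action, so it can only realize (1, 0) or (0, 1).
rewards = {"a": (1.0, 0.0), "b": (0.0, 1.0)}

def expected_value(p_a):
    """Expected reward vector when playing "a" with probability p_a."""
    return (p_a * rewards["a"][0] + (1 - p_a) * rewards["b"][0],
            p_a * rewards["a"][1] + (1 - p_a) * rewards["b"][1])

print(expected_value(1.0))  # (1.0, 0.0) -- pure strategy "a"
print(expected_value(0.0))  # (0.0, 1.0) -- pure strategy "b"
print(expected_value(0.5))  # (0.5, 0.5) -- reachable only via randomization
```

The Pareto-optimal point (0.5, 0.5) is achievable by a memoryless strategy, but only a randomized one, matching the dichotomy stated above.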
Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes
• Computer Science
2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science
• 2015
This work considers Markov decision processes with multiple limit-average (mean-payoff) objectives, and presents a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve the problem.
On Finding Compromise Solutions in Multiobjective Markov Decision Processes
• Computer Science
ECAI
• 2010
This work uses an alternative optimality concept that formalizes the notion of a best compromise solution, i.e., a policy yielding an expected-utility vector as close as possible to a reference point, and shows that this notion of optimality depends on the initial state.
Threshold Constraints with Guarantees for Parity Objectives in Markov Decision Processes
• Computer Science
ICALP
• 2017
This work extends the framework of [BFRR14] and follow-up papers by addressing the case of $\omega$-regular conditions encoded as parity objectives, a natural way to represent functional requirements of systems. It establishes that, for all variants of this problem, deciding the existence of a strategy lies in ${\sf NP} \cap {\sf coNP}$.
Multi-Objective Model Checking of Markov Decision Processes
• Computer Science, Mathematics
Log. Methods Comput. Sci.
• 2007
It is shown that one can compute an approximate Pareto curve with respect to a set of ω-regular properties in time polynomial in the size of the MDP.
Multi-weighted Markov Decision Processes with Reachability Objectives
• Mathematics
GandALF
• 2018
In this paper, we are interested in the synthesis of schedulers in double-weighted Markov decision processes, which satisfy both a percentile constraint over a weighted reachability condition, and a…
Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes
• Computer Science
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
• 2007
It is proved that the CON-MODP algorithm converges to the Pareto optimal set of value functions and policies for deterministic infinite horizon discounted multi-objective Markov decision processes.
A Survey of Multi-Objective Sequential Decision-Making
• Computer Science
J. Artif. Intell. Res.
• 2013
This article surveys algorithms designed for sequential decision-making problems with multiple objectives and proposes a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function, and the type of policies considered.
Markov Automata with Multiple Objectives
• Computer Science
CAV
• 2017
Algorithms to analyze several objectives simultaneously and approximate Pareto curves are presented, including, e.g., several (timed) reachability objectives, or various expected cost objectives.