Proper Policies in Infinite-State Stochastic Shortest Path Problems

  • D. Bertsekas
  • Published 28 November 2017
  • Mathematics, Computer Science
  • IEEE Transactions on Automatic Control
We consider stochastic shortest path problems with infinite state and control spaces, a nonnegative cost per stage, and a termination state. We extend the notion of a proper policy, a policy that terminates within a finite expected number of steps, from the context of finite state space to the context of infinite state space. We consider the optimal cost function $J^*$, and the optimal cost function …
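The finite-state special case that the paper extends can be sketched with value iteration. The following is a minimal illustration on a hypothetical three-state SSP (the transition matrices and costs below are made up, not from the paper), where state 2 is the cost-free, absorbing termination state:

```python
import numpy as np

# Hypothetical 3-state SSP: state 2 is the termination state t.
# P[u] is the transition matrix under control u, g[u] the nonnegative
# cost per stage (zero at t, and t is absorbing).
P = {0: np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.4, 0.5],
                  [0.0, 0.0, 1.0]]),
     1: np.array([[0.2, 0.2, 0.6],
                  [0.3, 0.1, 0.6],
                  [0.0, 0.0, 1.0]])}
g = {0: np.array([1.0, 2.0, 0.0]),
     1: np.array([3.0, 1.0, 0.0])}

J = np.zeros(3)  # start value iteration from J_0 = 0
for _ in range(200):
    # Bellman update: J <- min_u (g_u + P_u J), componentwise
    J = np.minimum.reduce([g[u] + P[u] @ J for u in P])

print(J)  # approximates the optimal cost J*, with J*(t) = 0
```

In this example every stationary policy is proper (each row of each `P[u]` gives positive probability of eventually reaching t), so the iterates converge to $J^*$ from the zero initial function, mirroring the finite-state theory the abstract refers to.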
Neuro-Optimal Control for Discrete Stochastic Processes via a Novel Policy Iteration Algorithm
In the proposed local policy iteration ADP algorithm, the iterative decision rules are updated in a local space of the whole state space, which can significantly reduce the computational burden for the CPU in comparison with the conventional policy iteration algorithm.
Online Abstract Dynamic Programming with Contractive Models
This paper addresses abstract dynamic programming (DP) in the online scenario, where the abstract DP mapping is time-varying instead of static. In this case, optimal costs and policies …
Improved value iteration for neural-network-based stochastic optimal control design
In the improved value iteration ADP algorithm, for the first time, a new criterion is proposed to verify whether the obtained policy is stable for stochastic processes.
2018 Index IEEE Transactions on Automatic Control Vol. 63
  • IEEE Transactions on Automatic Control
  • 2018
This index covers all technical items—papers, correspondence, reviews, etc.—that appeared in this periodical during 2018, and items from previous years that were commented upon or corrected in 2018.
Pull or Wait: How to Optimize Query Age of Information
The single-query case is formulated as a stochastic shortest path (SSP) problem with uncountable state and action spaces; an optimal solution is derived for this SSP problem and used as a building block for the solution of the PoW problem under periodic query arrivals.


Stochastic Shortest Path Problems Under Weak Conditions
This paper weakens the conditions under which some of the basic analytical and algorithmic results for finite-state stochastic shortest path problems hold, and shows the validity of Bellman’s equation and value iteration, and constructs a convergent policy iteration algorithm that uses a diminishing sequence of perturbations.
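The perturbation idea mentioned here can be sketched concretely: at iteration $k$ evaluate the current policy under a stage cost inflated by $\delta_k > 0$, then improve, with $\delta_k \downarrow 0$. The following toy illustration uses hypothetical three-state SSP data (not from the cited paper) with termination state 2:

```python
import numpy as np

# Hypothetical 3-state SSP; state 2 is the cost-free absorbing
# termination state.  Two controls per state.
P = {0: np.array([[0.5, 0.3, 0.2], [0.1, 0.4, 0.5], [0.0, 0.0, 1.0]]),
     1: np.array([[0.2, 0.2, 0.6], [0.3, 0.1, 0.6], [0.0, 0.0, 1.0]])}
g = {0: np.array([1.0, 2.0, 0.0]), 1: np.array([3.0, 1.0, 0.0])}
NT = [0, 1]                          # non-terminal states

mu = np.array([0, 0])                # initial stationary policy on NT
for k in range(30):
    delta = 1.0 / (k + 1)            # diminishing perturbation delta_k
    # Exact policy evaluation under perturbed costs:
    #   J_mu = (I - P_mu)^{-1} (g_mu + delta) on the non-terminal states
    P_mu = np.array([P[mu[i]][i, NT] for i in NT])
    g_mu = np.array([g[mu[i]][i] + delta for i in NT])
    J_nt = np.linalg.solve(np.eye(2) - P_mu, g_mu)
    J = np.append(J_nt, 0.0)         # termination state has cost 0
    # Policy improvement against the perturbed evaluation
    Q = np.array([[g[u][i] + P[u][i] @ J for u in (0, 1)] for i in NT])
    mu = Q.argmin(axis=1)

print(mu, J_nt)
```

The perturbation inflates the evaluated cost of a policy in proportion to its expected time to termination, which is what lets the cited paper rule out degenerate improper policies; as $\delta_k \to 0$ the evaluations approach the unperturbed costs.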
Stable Optimal Control and Semicontractive Dynamic Programming
  • D. Bertsekas
  • Mathematics, Computer Science
  • SIAM J. Control. Optim.
  • 2018
A new unifying notion of stable feedback policy is introduced, based on perturbation of the cost per stage, which in addition to implying convergence of the generated states to the destination, quantifies the speed of convergence.
Regular Policies in Abstract Dynamic Programming
This work develops a unifying methodology to address long-standing analytical and algorithmic issues in broad classes of undiscounted models, including stochastic and minimax shortest path problems, as well as positive cost, negative cost, risk-sensitive, and multiplicative cost problems.
A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies
A mixed value and policy iteration method that allows the use of stationary policies in computing the optimal cost function in a manner that resembles policy iteration and uses the new convergence theorem for value iteration to establish the convergence of the method for the nonnegative cost case.
An Analysis of Stochastic Shortest Path Problems
We consider a stochastic version of the classical shortest path problem whereby for each node of a graph, we must choose a probability distribution over the set of successor nodes so as to reach a …
Constrained Markov Decision Processes
Introduction. Examples of Constrained Dynamic Control Problems. On Solution Approaches for CMDPs with Expected Costs. Other Types of CMDPs. Cost Criteria and Assumptions. The Convex Analytical Approach. …
An analysis of transient Markov decision processes
This paper is concerned with the analysis of Markov decision processes in which a natural form of termination ensures that the expected future costs are bounded, at least under some policies. Whereas …
Further topics on discrete-time Markov control processes
7. Ergodicity and Poisson's Equation: 7.1 Introduction; 7.2 Weighted norms and signed kernels (A. Weighted-norm spaces, B. Signed kernels, C. Contraction maps); 7.3 Recurrence concepts; …
Markov Decision Processes: Discrete Stochastic Dynamic Programming
  • M. Puterman
  • Mathematics, Computer Science
  • Wiley Series in Probability and Statistics
  • 1994
Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.
The transient case for discrete time dynamic programs is extended to allow for fairly general state and action spaces. Most of the finite state space results hold, including that the transient case …