# Proper Policies in Infinite-State Stochastic Shortest Path Problems

```bibtex
@article{Bertsekas2018ProperPI,
  title   = {Proper Policies in Infinite-State Stochastic Shortest Path Problems},
  author  = {Dimitri P. Bertsekas},
  journal = {IEEE Transactions on Automatic Control},
  year    = {2018},
  volume  = {63},
  pages   = {3787-3792}
}
```

We consider stochastic shortest path problems with infinite state and control spaces, a nonnegative cost per stage, and a termination state. We extend the notion of a proper policy, a policy that terminates within a finite expected number of steps, from the context of finite state spaces to the context of infinite state spaces. We consider the optimal cost function $J^*$, and the optimal cost function …
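As a concrete illustration of the setting, value iteration for a small finite-state stochastic shortest path problem with a cost-free, absorbing termination state can be sketched as follows. Note that this is only a toy sketch: the paper itself treats infinite state and control spaces, and the states, transition matrices, and stage costs below are hypothetical.

```python
import numpy as np

# Hypothetical finite-state SSP: states 0..2 are nonterminal, state 3 is the
# cost-free, absorbing termination state. P[u] is the transition matrix under
# control u; stage costs are nonnegative, and cost is 0 at the termination state.
P = {
    0: np.array([[0.0, 0.8, 0.0, 0.2],
                 [0.0, 0.0, 0.7, 0.3],
                 [0.0, 0.0, 0.5, 0.5],
                 [0.0, 0.0, 0.0, 1.0]]),
    1: np.array([[0.9, 0.0, 0.0, 0.1],
                 [0.2, 0.0, 0.0, 0.8],
                 [0.0, 0.3, 0.0, 0.7],
                 [0.0, 0.0, 0.0, 1.0]]),
}
cost = {0: np.array([1.0, 2.0, 1.0, 0.0]),
        1: np.array([4.0, 1.0, 3.0, 0.0])}

def value_iteration(P, cost, n_states=4, term=3, iters=2000):
    """Bellman iteration J_{k+1}(x) = min_u [ g(x,u) + sum_y p(y|x,u) J_k(y) ]."""
    J = np.zeros(n_states)
    for _ in range(iters):
        J = np.min([cost[u] + P[u] @ J for u in P], axis=0)
        J[term] = 0.0  # the termination state stays cost-free
    return J

J_star = value_iteration(P, cost)
```

In this toy example every control reaches the termination state with probability at least 0.1 from each state, so the Bellman operator is a contraction and the iterates converge to $J^*$; the infinite-state analysis in the paper concerns, roughly speaking, what survives when such uniform conditions are unavailable.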

#### 3 Citations

Neuro-Optimal Control for Discrete Stochastic Processes via a Novel Policy Iteration Algorithm

- Computer Science
- IEEE Transactions on Systems, Man, and Cybernetics: Systems
- 2020

In the proposed local policy iteration ADP algorithm, the iterative decision rules are updated over a local subset of the whole state space, which can significantly reduce the computational burden in comparison with the conventional policy iteration algorithm.

Online Abstract Dynamic Programming with Contractive Models

- Mathematics
- 2021

This paper addresses the abstract dynamic programming (DP) in the online scenario, where the abstract DP mapping is time-varying, instead of static. In this case, optimal costs and policies at…

Improved value iteration for neural-network-based stochastic optimal control design

- Computer Science, Medicine
- Neural Networks
- 2020

In the improved value iteration ADP algorithm, for the first time, a new criterion is proposed to verify whether the obtained policy is stable for stochastic processes.

#### References

Showing 1–10 of 40 references.

Stochastic Shortest Path Problems Under Weak Conditions

- Computer Science
- 2013

This paper weakens the conditions under which some of the basic analytical and algorithmic results for finite-state stochastic shortest path problems hold, and shows the validity of Bellman’s equation and value iteration, and constructs a convergent policy iteration algorithm that uses a diminishing sequence of perturbations.
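The perturbation idea in that reference can be sketched on a minimal hypothetical example: adding a small δ > 0 to every nonterminal stage cost makes improper policies (those that never terminate) incur infinite perturbed cost, so policy iteration cannot get stuck on them; δ is then driven toward zero. Everything below (the two-state problem, the controls, the δ schedule) is an illustrative assumption, not the algorithm from the paper.

```python
import numpy as np

# Hypothetical 2-state example: state 0 is nonterminal, state 1 is termination.
# Control 0 stays at state 0 with zero cost (improper policy); control 1
# terminates immediately with cost 1 (proper policy).
P = {0: np.array([[1.0, 0.0], [0.0, 1.0]]),
     1: np.array([[0.0, 1.0], [0.0, 1.0]])}
g = {0: np.array([0.0, 0.0]), 1: np.array([1.0, 0.0])}

def evaluate(u, delta, iters=5000):
    """Cost of the stationary policy 'always apply u', with delta added to the
    nonterminal stage cost (capped to keep improper policies numerically finite)."""
    J = np.zeros(2)
    for _ in range(iters):
        J = np.minimum(g[u] + np.array([delta, 0.0]) + P[u] @ J, 1e9)
    return J

def perturbed_pi(deltas=(1.0, 0.1, 0.01)):
    u = 0                                  # start from the improper "stay" policy
    for delta in deltas:                   # diminishing perturbation sequence
        J = evaluate(u, delta)
        # Policy improvement: pick the control minimizing the one-step Q-value
        u = int(np.argmin([g[v][0] + P[v][0] @ J for v in (0, 1)]))
    return u

# With delta = 0 the improper policy also has zero cost and improvement can keep
# it; with any delta > 0 its perturbed cost blows up and "terminate" is selected.
```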

Stable Optimal Control and Semicontractive Dynamic Programming

- Mathematics, Computer Science
- SIAM J. Control. Optim.
- 2018

A new unifying notion of stable feedback policy is introduced, based on perturbation of the cost per stage, which in addition to implying convergence of the generated states to the destination, quantifies the speed of convergence.

Regular Policies in Abstract Dynamic Programming

- Computer Science, Mathematics
- SIAM J. Optim.
- 2017

This work develops a unifying methodology to address long standing analytical and algorithmic issues in broad classes of undiscounted models, including stochastic and minimax shortest path problems, as well as positive cost, negative cost, risk-sensitive, and multiplicative cost problems.

A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies

- Mathematics, Computer Science
- Math. Oper. Res.
- 2015

A mixed value and policy iteration method that allows the use of stationary policies in computing the optimal cost function in a manner that resembles policy iteration and uses the new convergence theorem for value iteration to establish the convergence of the method for the nonnegative cost case.

An Analysis of Stochastic Shortest Path Problems

- Mathematics, Computer Science
- Math. Oper. Res.
- 1991

We consider a stochastic version of the classical shortest path problem whereby for each node of a graph, we must choose a probability distribution over the set of successor nodes so as to reach a…

Constrained Markov Decision Processes

- Mathematics
- 1999

Contents: Introduction; Examples of Constrained Dynamic Control Problems; On Solution Approaches for CMDPs with Expected Costs; Other Types of CMDPs; Cost Criteria and Assumptions; The Convex Analytical Approach; …

An analysis of transient Markov decision processes

- Mathematics
- Journal of Applied Probability
- 2006

This paper is concerned with the analysis of Markov decision processes in which a natural form of termination ensures that the expected future costs are bounded, at least under some policies. Whereas…

Further topics on discrete-time Markov control processes

- Mathematics
- 1999

Contents (excerpt): 7. Ergodicity and Poisson's Equation; 7.1 Introduction; 7.2 Weighted Norms and Signed Kernels (A. Weighted-norm spaces; B. Signed kernels; C. Contraction maps); 7.3 Recurrence Concepts; …

Markov Decision Processes: Discrete Stochastic Dynamic Programming

- Mathematics, Computer Science
- Wiley Series in Probability and Statistics
- 1994

Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.

On the Transient Case for Markov Decision Chains with General State Spaces

- Mathematics, Computer Science
- 1978

The transient case for discrete time dynamic programs is extended to allow for fairly general state and action spaces. Most of the finite state space results hold, including that the transient case…