Control of Markov chains with long-run average cost criterion: the dynamic programming equations
  • Vivek S. Borkar
  • Mathematics
    SIAM Journal on Control and Optimization
  • Published 1 May 1989
The long-run average cost control problem for discrete time Markov chains on a countable state space is studied in a very general framework. Necessary and sufficient conditions for optimality in terms of the dynamic programming equations are given when an optimal stable stationary strategy is known to exist (e.g., for the situations studied in [Stochastic Differential Systems, Stochastic Control Theory and Applications, IMA Vol. Math. Appl. 10, Springer-Verlag, New York, Berlin, 1988, pp. 57–77…
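For orientation, the dynamic programming equations in question are the standard average cost optimality equations. As a sketch (standard notation, not reproduced from the paper): for countable state space S, action set A, one-step cost c, and transition kernel p,

```latex
\rho + h(x) \;=\; \min_{a \in A}\Big[\, c(x,a) + \sum_{y \in S} p(y \mid x, a)\, h(y) \Big], \qquad x \in S,
```

where $\rho$ is the optimal long-run average cost and $h$ is a relative value function; under suitable stability conditions, a stationary strategy attaining the minimum at every state is average-cost optimal.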
Equilibrium control policies for Markov chains
  • Andreas A. Malikopoulos
  • Computer Science, Mathematics
    IEEE Conference on Decision and Control and European Control Conference
  • 2011
This paper addresses the problem of controlling a Markov chain so as to minimize the average cost per unit time. It derives conditions guaranteeing that a saddle point exists for the new dual problem and shows that this saddle point is an equilibrium control policy for each state of the Markov chain.
On the existence of optimal stationary policies for average Markov decision processes with countable states
This paper studies conditions for the existence of an optimal stationary policy in a countable-state Markov decision process under the long-run average criterion. It concludes that the method can handle cost functions unbounded both from below and from above, subject only to continuity and ergodicity conditions.
Average optimality for Markov decision processes in borel spaces: a new condition and approach
In this paper we study discrete-time Markov decision processes with Borel state and action spaces. The criterion is to minimize average expected costs, and the costs may have neither upper nor lower bounds.
Uniqueness and Stability of Optimal Policies of Finite State Markov Decision Processes
This paper considers infinite horizon discrete-time optimal control of Markov decision processes (MDPs) with finite state spaces and compact action sets and asserts that the class of MDPs with essentially unique and stable minimizing Markov actions contains the intersection of countably many open dense sets.
Recent results on conditions for the existence of average optimal stationary policies
This paper concerns countable state space Markov decision processes endowed with a (long-run expected) average reward criterion. For these models we summarize and, in some cases, extend some recent results.
Infinite horizon average cost dynamic programming subject to ambiguity on conditional distribution
This paper addresses the optimality of stochastic control strategies based on the infinite horizon average cost criterion, subject to total variation distance ambiguity on the conditional distribution of the controlled process, and derives a new dynamic programming recursion which minimizes the future ambiguity.
Infinite Horizon Average Cost Dynamic Programming Subject to Total Variation Distance Ambiguity
This work analyzes the infinite horizon minimax average cost Markov Control Model (MCM) for a class of controlled process conditional distributions, which belong to a ball, with respect to total variation distance metric, and shows that if the nominal controlled process distribution is irreducible, then for every stationary Markov control policy the maximizing conditional distribution of the controlled process is also irreducible.
Discrete-time controlled Markov processes with average cost criterion: a survey
This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem.
The convergence of value iteration in average cost Markov decision chains
  • L. Sennott
  • Mathematics, Computer Science
    Oper. Res. Lett.
  • 1996
Conditions are given under which an average cost optimal stationary policy exists and J is the limit of v_n(·)/n, where v_n(·) is the minimum n-step expected cost.
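The convergence result summarized above is easy to see numerically. The following sketch (a hypothetical two-state, two-action MDP invented for illustration, not taken from the paper) runs finite-horizon value iteration and checks that v_n(x)/n approaches the same average cost J from every state:

```python
import numpy as np

# Toy finite MDP: 2 states, 2 actions (illustrative, not from the paper).
# P[a][x, y] = transition probability from x to y under action a;
# c[x, a] = one-step expected cost in state x under action a.
P = [np.array([[0.9, 0.1],
               [0.2, 0.8]]),   # action 0
     np.array([[0.5, 0.5],
               [0.6, 0.4]])]   # action 1
c = np.array([[1.0, 2.0],
              [3.0, 0.5]])

def value_iteration(n_steps):
    """Return v_n, the minimum n-step expected cost vector."""
    v = np.zeros(2)
    for _ in range(n_steps):
        # Bellman backup: v_{k+1}(x) = min_a [ c(x,a) + sum_y P(y|x,a) v_k(y) ]
        q = np.stack([c[:, a] + P[a] @ v for a in range(2)], axis=1)
        v = q.min(axis=1)
    return v

n = 2000
v_n = value_iteration(n)
J_est = v_n / n   # both components approach the same limit J
print(J_est)
```

Because v_n(x) - v_n(y) stays bounded for a unichain model, dividing by n washes out the state dependence, and both components of `J_est` agree to within O(1/n).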
On Optimality in Probability and Almost Surely for Processes with a Communication Property. I. The Discrete Time Case.
We establish conditions under which the strategy minimizing the expected value of a cost functional has a much stronger property; namely, it minimizes the random cost functional itself for all
Introduction to stochastic control
Abstract: The text treats stochastic control problems for Markov chains, discrete time Markov processes, and diffusion models, and discusses methods of putting other problems into the Markovian framework.
Hitting-time and occupation-time bounds implied by drift analysis with applications
Bounds of exponential type are derived for the first-hitting time and occupation times of a real-valued random sequence which has a uniform negative drift whenever the sequence is above a fixed level.
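As a rough numerical illustration of the drift-analysis setting (an invented simulation, not code from the paper): a sequence with uniform negative drift above a level hits that level in a time whose mean scales like distance/|drift|, consistent with the exponential-type bounds described above.

```python
import random

def hitting_time(x0, level, drift=-0.25, max_steps=100_000,
                 rng=random.Random(0)):
    """Steps until the sequence first falls to or below `level`.

    Above `level` each increment is `drift` plus Uniform(-1, 1) noise,
    i.e. the sequence has a uniform negative mean drift there.
    The `rng` default is shared across calls for reproducibility.
    """
    x, t = x0, 0
    while x > level and t < max_steps:
        x += drift + rng.uniform(-1.0, 1.0)
        t += 1
    return t

# Average first-hitting time of level 0 from x0 = 10: by Wald's identity
# it should be close to (x0 - level) / |drift| = 40 steps.
times = [hitting_time(10.0, 0.0) for _ in range(200)]
avg = sum(times) / len(times)
print(avg)
```

The exponential-type bounds are stronger than this mean estimate: they control the whole upper tail of the hitting-time distribution, which is what makes them useful for stability arguments.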
Controlled Markov Chains and Stochastic Networks
Controlled Markov chains with average cost criterion and with special cost and transition structures are studied. Existence of optimal stationary strategies is established for the average cost criterion.
On Minimum Cost Per Unit Time Control of Markov Chains
The “minimum cost per unit time” control problem is studied for a class of Markov chains that, though important in applications, does not fit the conventional framework for this problem. Existence of optimal policies is established.
Continuity of mean recurrence times in denumerable semi-Markov processes
Summary: For a family of semi-Markov processes where the transition matrices for the embedded Markov chains and the mean sojourn times depend continuously on a parameter, we give equivalent as well as sufficient conditions for the continuity of the mean recurrence times.
A note on simultaneous recurrence conditions on a set of denumerable stochastic matrices : (preprint)
In this paper we consider a set of denumerable stochastic matrices where the parameter set is a compact metric space. We give a number of simultaneous recurrence conditions on the stochastic matrices.
Optimal control of service in tandem queues
The optimal policy is of the form u = a or u = 0 according as x_1 lies above or below a switching function S. The controlled chain can be nonergodic, but it is ergodic in the average cost case.
Markov decision processes
In this review particular emphasis is given to the work which has been done in the Netherlands, but the main line of the paper is determined by the development of the applicability of the available theory.