# Chapter 7: Semi-Markov Decision Processes

@inproceedings{2004CHAPTER7S, title={Chapter 7: Semi-Markov Decision Processes}, year={2004} }

The previous chapter dealt with the discrete-time Markov decision model. In that model, decisions can be made only at the fixed epochs t = 0, 1, . . . . However, in many stochastic control problems the times between decision epochs are not constant but random. A possible tool for analysing such problems is the semi-Markov decision model. In Section 7.1 we discuss the basic elements of this model. Also, for the optimality criterion of the long-run average cost per time unit, we give a data…
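The idea sketched in the abstract can be made concrete: a semi-Markov decision model with expected one-stage costs c(i, a), expected sojourn times τ(i, a), and transition probabilities p(j | i, a) can be converted into an equivalent discrete-time average-cost Markov decision model by the standard data transformation c̄(i, a) = c(i, a)/τ(i, a) and p̄(j | i, a) = (τ/τ(i, a)) p(j | i, a) + δ_ij (1 − τ/τ(i, a)), where 0 < τ < min τ(i, a). The sketch below, on a made-up two-state maintenance example (all numbers are illustrative assumptions, not from the chapter), applies this transformation and then runs standard value iteration with a span stopping test:

```python
import numpy as np

# Hypothetical 2-state example (0 = machine good, 1 = machine degraded).
# For each (state, action): (expected cost, expected sojourn time, transition probs).
smdp = {
    0: {"run":    (0.0,  2.0, np.array([0.5, 0.5]))},
    1: {"run":    (5.0,  1.0, np.array([0.0, 1.0])),
        "repair": (10.0, 1.0, np.array([1.0, 0.0]))},
}

def transform(smdp, tau):
    """Data transformation: embed the SMDP in a discrete-time average-cost
    MDP with fictitious step size tau (take tau strictly below the minimum
    sojourn time so every state gets a positive self-transition, which
    guarantees aperiodicity)."""
    mdp = {}
    for i, acts in smdp.items():
        mdp[i] = {}
        for a, (c, t, p) in acts.items():
            cbar = c / t                    # cost rate of (i, a)
            pbar = (tau / t) * p
            pbar[i] += 1.0 - tau / t        # leftover mass stays in state i
            mdp[i][a] = (cbar, pbar)
    return mdp

def value_iteration(mdp, eps=1e-8, max_iter=10_000):
    """Average-cost value iteration with lower/upper span bounds."""
    n = len(mdp)
    v = np.zeros(n)
    for _ in range(max_iter):
        v_new = np.empty(n)
        best = {}
        for i, acts in mdp.items():
            q = {a: c + p @ v for a, (c, p) in acts.items()}
            best[i] = min(q, key=q.get)
            v_new[i] = q[best[i]]
        diff = v_new - v
        g = 0.5 * (diff.max() + diff.min())  # estimate of the average cost
        if diff.max() - diff.min() < eps:
            return g, best
        v = v_new
    return g, best

tau = 0.5  # 0 < tau < min sojourn time (here min is 1.0)
g, policy = value_iteration(transform(smdp, tau))
print(f"average cost per unit time ~ {g:.4f}, policy = {policy}")
```

Because c̄ is a cost *rate*, the average cost per step of the transformed chain equals the long-run average cost per time unit of the original semi-Markov model; in this toy instance the repair policy yields an average cost of 2 per unit time, versus 5 for running the degraded machine forever.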
