It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated o.d.e. This in turn implies convergence of the algorithm. Several specific classes of algorithms are considered as applications. It is found that the results provide (i) a simpler derivation of known results for …
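The ODE method described in this abstract can be illustrated with a minimal sketch (not taken from the paper): run the recursion x_{n+1} = x_n + a_n(h(x_n) + M_{n+1}) with diminishing step sizes a_n and zero-mean noise M_{n+1}. When the origin of the associated o.d.e. dx/dt = h(x) is asymptotically stable, the iterates track the o.d.e. and converge. The function names and parameters here are illustrative, not from the source.

```python
import random

def sa_iterates(h, x0, n_steps=20000, noise_std=0.1, seed=0):
    """Run the stochastic approximation recursion
    x_{n+1} = x_n + a_n * (h(x_n) + M_{n+1}),
    with step sizes a_n = 1/(n+1) and i.i.d. Gaussian noise M_{n+1}."""
    rng = random.Random(seed)
    x = x0
    for n in range(n_steps):
        a = 1.0 / (n + 1)
        x += a * (h(x) + rng.gauss(0.0, noise_std))
    return x

# h(x) = -x: the associated o.d.e. dx/dt = -x has a globally
# asymptotically stable origin, so the noisy iterates settle near 0.
x_final = sa_iterates(lambda x: -x, x0=5.0)
```

With a_n = 1/(n+1) the iterates reduce to a running average of the noise after the first step, so x_final is close to zero despite the large initial condition.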
In Part I we developed stability concepts for discrete chains, together with Foster-Lyapunov criteria for them to hold. Part II was devoted to developing related stability concepts for continuous-time processes. In this paper we develop criteria for these forms of stability for continuous-parameter Markovian processes on general state spaces, based on …
We study different notions of capacity for time-slotted ALOHA systems. In these systems, multiple users synchronously send packets in a bursty manner over a common additive white Gaussian noise (AWGN) channel. The users do not coordinate their transmissions, which may collide at the receiver. For such a system, we define both single-slot capacity and …
The average cost optimal control problem is addressed for Markov decision processes with unbounded cost. It is found that the policy iteration algorithm generates a sequence of policies which are c-regular (a strong stability condition), where c is the cost function under consideration. This result only requires the existence of an initial c-regular policy, …
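The abstract concerns average-cost policy iteration with unbounded cost; the standard discounted, finite-state analogue sketched below (an assumption for illustration, not the paper's setting) shows the two-phase recursion: evaluate the current policy, then improve greedily, repeating until the policy is stable. All names and the example MDP are illustrative.

```python
def policy_iteration(P, c, gamma=0.95):
    """Policy iteration on a finite discounted MDP.
    P[a][s] is the transition distribution from state s under action a,
    c[a][s] is the one-step cost; returns a stable policy and its value."""
    n = len(c[0])
    policy = [0] * n
    while True:
        # Policy evaluation: iterate V <- c_pi + gamma * P_pi V to a fixed point.
        V = [0.0] * n
        for _ in range(2000):
            V = [c[policy[s]][s]
                 + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
                 for s in range(n)]
        # Policy improvement: act greedily with respect to V.
        new_policy = [
            min(range(len(c)),
                key=lambda a: c[a][s]
                + gamma * sum(P[a][s][t] * V[t] for t in range(n)))
            for s in range(n)
        ]
        if new_policy == policy:
            return policy, V
        policy = new_policy

# Two-state, two-action example: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch
]
c = [
    [1.0, 2.0],                 # cost of staying, per state
    [3.0, 0.5],                 # cost of switching, per state
]
pi_star, V_star = policy_iteration(P, c)
```

In this example the stable policy stays in state 0 (cheap to remain) and switches out of state 1 (cheap to leave), with values V(0) = 1/(1 - 0.95) = 20 and V(1) = 0.5 + 0.95 * 20 = 19.5.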
This paper establishes new criteria for stability and for instability of multiclass network models under a given stationary policy. It also extends previous results on the approximation of the solution to the average cost optimality equations through an associated fluid model: It is shown that an optimized network possesses a fluid limit model which is …
This paper considers in parallel the scheduling problem for multi-class queueing networks and the optimization of Markov decision processes. It is shown that the value iteration algorithm may perform poorly when it is not initialized properly. The most typical case, in which the initial value function is taken to be zero, may be a particularly bad …
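The initialization sensitivity mentioned in this abstract can be seen even in a tiny discounted example (an illustrative sketch under assumed names and data, not the paper's network model): starting value iteration from zero takes many sweeps to converge, while a warm start near the fixed point converges almost immediately.

```python
def value_iteration(P, c, gamma=0.95, V0=None, tol=1e-8, max_iter=10000):
    """Tabular value iteration for a finite discounted MDP.
    P[a][s] is the transition distribution, c[a][s] the one-step cost;
    returns the value function and the number of sweeps used, which
    depends strongly on the initial guess V0."""
    n = len(c[0])
    V = list(V0) if V0 is not None else [0.0] * n
    for sweep in range(max_iter):
        V_new = [
            min(c[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
                for a in range(len(c)))
            for s in range(n)
        ]
        if max(abs(V_new[s] - V[s]) for s in range(n)) < tol:
            return V_new, sweep + 1
        V = V_new
    return V, max_iter

# Two-state, two-action example: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch
]
c = [
    [1.0, 2.0],                 # cost of staying, per state
    [3.0, 0.5],                 # cost of switching, per state
]
V_zero, sweeps_zero = value_iteration(P, c, V0=[0.0, 0.0])      # cold start
V_warm, sweeps_warm = value_iteration(P, c, V0=V_zero)          # warm start
```

The cold start needs on the order of log(tol)/log(gamma), here several hundred, sweeps to close the gap from zero to the fixed point, while the warm start terminates in a single sweep.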
We address the problem of computing the optimal Q-function in Markov decision problems with infinite state-space. We analyze the convergence properties of several variations of Q-learning when combined with function approximation, extending the analysis of TD-learning in (Tsitsiklis & Van Roy, 1996a) to stochastic control settings. We identify …
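While this abstract concerns Q-learning with function approximation on infinite state spaces, the underlying recursion is easiest to see in the tabular case. The sketch below (illustrative names and a deterministic toy MDP, not from the paper) runs Q(s,a) += a_n (c + gamma * min_a' Q(s',a') - Q(s,a)) under a uniformly exploring policy.

```python
import random

def q_learning(n_states, n_actions, step, cost, gamma=0.9,
               alpha0=0.5, n_steps=5000, seed=0):
    """Tabular Q-learning with per-pair diminishing step sizes.
    `step(s, a)` returns the next state; `cost(s, a)` the one-step cost."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(n_steps):
        a = rng.randrange(n_actions)          # exploratory (uniform) policy
        s2 = step(s, a)
        visits[s][a] += 1
        alpha = alpha0 / visits[s][a]         # diminishing step size a_n
        Q[s][a] += alpha * (cost(s, a) + gamma * min(Q[s2]) - Q[s][a])
        s = s2
    return Q

# Deterministic 2-state example: action 0 moves to state 0, action 1 to
# state 1. Staying in state 0 is free; every other move costs 1.
def step(s, a):
    return 0 if a == 0 else 1

def cost(s, a):
    return 0.0 if (s == 0 and a == 0) else 1.0

Q = q_learning(2, 2, step, cost)
```

The greedy policy extracted from the learned Q-function chooses action 0 in both states: stay in the free state, and escape state 1 back toward it.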