We introduce a sensitivity-based view of the area of learning and optimization of stochastic dynamic systems. We show that this sensitivity-based view provides a unified framework for many different disciplines in this area, including perturbation analysis, Markov decision processes, reinforcement learning, and identification and adaptive control. Many …
Two fundamental concepts and quantities, realization factors and performance potentials, are introduced for Markov processes. The relations among these two quantities and the group inverse of the infinitesimal generator are studied. It is shown that the sensitivity of the steady-state performance with respect to the change of the infinitesimal generator can …
One of the key challenges in the design of bandwidth allocation policies for a multi-service mobile cellular network is to guarantee the potentially different Quality of Service (QoS) requirements of diverse applications, while at the same time ensuring that the scarce bandwidth is utilized efficiently. Complete Sharing (CS) and Dynamic Partition (DP) schemes …
Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is two-fold. First, we develop …
The basic concepts of three branches of game theory, leader-follower, cooperative, and two-person nonzero-sum games, are reviewed and applied to the study of the Internet pricing issue. In particular, we emphasize that the cooperative game (also called the bargaining problem) provides an overall picture of the issue. With a simple model for Internet …
In this paper, we propose a new call admission control scheme called dual threshold bandwidth reservation, or the DTBR scheme. The main novelty is that it builds upon a complete sharing approach, in which the channels in each cell are shared among the different traffic types and multiple thresholds are used to meet the specific quality-of-service (QoS) …
We propose a time aggregation approach for the solution of infinite-horizon average-cost Markov decision processes via policy iteration. In this approach, policy update is only carried out when the process visits a subset of the state space. As in state aggregation, this approach leads to a reduced state space, which can lead to a substantial reduction in …