Simultaneous perturbation stochastic approximation (SPSA) algorithms have been found to be very effective for high-dimensional simulation optimization problems. The main idea is to estimate the gradient using simulation output performance measures at only two settings of the N-dimensional parameter vector being optimized rather than at the…
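The two-evaluation gradient estimate described in this abstract can be sketched as follows. This is a minimal illustration of textbook SPSA, not the specific variant the paper proposes (the abstract is cut off before those details). The function names `spsa_gradient` and `spsa_minimize`, the Rademacher (±1) perturbation, and the gain constants are assumptions chosen for the example.

```python
import numpy as np


def spsa_gradient(f, theta, c, rng):
    """Estimate the gradient of f at theta from only two evaluations of f.

    All N coordinates are perturbed at once by a random +/-1 vector, so the
    cost is two calls to f regardless of the dimension N.
    """
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # assumed Rademacher perturbation
    y_plus = f(theta + c * delta)
    y_minus = f(theta - c * delta)
    return (y_plus - y_minus) / (2.0 * c * delta)


def spsa_minimize(f, theta0, n_iter=2000, a=0.1, c=0.1, A=100.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Stochastic approximation loop driven by the SPSA gradient estimate.

    The decaying gain sequences a_k and c_k use commonly cited default
    exponents; they are illustrative assumptions, not values from the abstract.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha   # step size
        c_k = c / (k + 1) ** gamma       # perturbation size
        theta -= a_k * spsa_gradient(f, theta, c_k, rng)
    return theta


if __name__ == "__main__":
    # Hypothetical test problem: a 10-dimensional quadratic with minimum at 1.
    quadratic = lambda x: float(np.sum((x - 1.0) ** 2))
    print(spsa_minimize(quadratic, np.zeros(10)))
```

A one-sided or coordinate-wise finite-difference scheme would need on the order of N evaluations per step; the sketch above keeps the per-step cost at two evaluations, which is the property the abstract highlights.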
Based on recent results for multiarmed bandit problems, we propose an adaptive sampling algorithm that approximates the optimal value of a finite-horizon Markov decision process (MDP) with finite state and action spaces. The algorithm adaptively chooses which action to sample as the sampling process proceeds and generates an asymptotically unbiased…
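The idea of choosing which action to sample with a bandit rule can be sketched as below. The abstract does not say which bandit rule or which estimator is used, so the UCB1 index, the count-weighted average returned at each stage, and the names `adaptive_value_estimate` and `step` are all assumptions made for illustration. The exploration bonus is not rescaled for multi-stage returns, which is a simplification.

```python
import math
import random


def adaptive_value_estimate(state, stage, horizon, actions, step, n_samples, rng):
    """Estimate the optimal value from `state` at `stage` by adaptive sampling.

    `step(state, action, rng)` is a hypothetical simulator returning
    (reward, next_state). Work grows as n_samples ** horizon, so this is
    only practical for short horizons.
    """
    if stage == horizon:
        return 0.0  # no reward is collected after the final stage

    counts = {a: 0 for a in actions}
    sums = {a: 0.0 for a in actions}

    def sample(a):
        # One simulated transition plus a recursive estimate of the value-to-go.
        reward, nxt = step(state, a, rng)
        q = reward + adaptive_value_estimate(
            nxt, stage + 1, horizon, actions, step, n_samples, rng)
        counts[a] += 1
        sums[a] += q

    for a in actions:  # sample every action once to initialize the means
        sample(a)
    for n in range(len(actions), n_samples):
        # Assumed UCB1 index: sample mean plus an exploration bonus.
        best = max(actions, key=lambda b: sums[b] / counts[b]
                   + math.sqrt(2.0 * math.log(n) / counts[b]))
        sample(best)

    # The count-weighted average of per-action means equals the mean of all samples.
    return sum(sums.values()) / sum(counts.values())


if __name__ == "__main__":
    # Hypothetical two-action MDP: action 1 always pays 1, action 0 pays 0.
    toy_step = lambda s, a, rng: (float(a), s)
    print(adaptive_value_estimate(0, 0, 2, [0, 1], toy_step, 50, random.Random(0)))
```

Because some samples are always spent on inferior actions, the estimate in this sketch sits below the true optimal value for any finite sampling budget and approaches it as `n_samples` grows.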
We introduce a new randomized method called Model Reference Adaptive Search (MRAS) for solving global optimization problems. The method works with a parameterized probabilistic model on the solution space and generates at each iteration a group of candidate solutions. These candidate solutions are then used to update the parameters associated with the…
In this paper, we consider Simultaneous Perturbation Stochastic Approximation (SPSA) for function minimization. The standard assumption for convergence is that the function be three times differentiable, although weaker assumptions have been used for special cases. However, all work that we are aware of at least requires differentiability. In this paper, we…
We propose a time aggregation approach for the solution of infinite horizon average cost Markov decision processes via policy iteration. In this approach, policy update is only carried out when the process visits a subset of the state space. As in state aggregation, this approach leads to a reduced state space, which may lead to a substantial reduction in…
We propose a novel algorithm called evolutionary policy iteration (EPI) for solving infinite horizon discounted reward Markov…
We present a new approach to pricing American-style derivatives that is applicable to any Markovian setting (i.e., not limited to geometric Brownian motion) for which European call-option prices are readily available. By approximating the value function with an appropriately chosen interpolation function, the pricing of an American-style derivative with…
A protocol mismatch occurs when heterogeneous networks try to communicate with each other. Such mismatches are inevitable due to the proliferation of a multitude of networking architectures, hardware, and software on one hand, and the need for global connectivity on the other hand. In order to circumvent this problem, the solution of protocol conversion has…