Warren B. Powell

We consider a Bayesian ranking and selection problem with independent normal rewards and a correlated multivariate normal belief on the mean values of these rewards. Because this formulation of the ranking and selection problem models dependence between alternatives’ mean values, algorithms may utilize this dependence to perform efficiently even when the …
We address the problem of determining optimal stepsizes for estimating parameters in the context of approximate dynamic programming. The sufficient conditions for convergence of the stepsize rules have been known for 50 years, but practical computational work tends to use formulas with parameters that have to be tuned for specific applications. The problem …
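As a rough illustration of what "formulas with parameters that have to be tuned" can look like, the sketch below uses the generalized harmonic stepsize a / (a + n - 1), a common textbook rule and not the rule proposed in the paper. It assumes the convergence conditions mentioned above are the classical stochastic approximation ones (positive stepsizes whose sum diverges while the sum of their squares stays finite), which this rule satisfies. The names `harmonic_stepsize`, `smooth`, and the parameter `a` are hypothetical and chosen only for this example.

```python
def harmonic_stepsize(n, a=1.0):
    """Generalized harmonic stepsize a / (a + n - 1) for iteration n = 1, 2, ...

    The parameter `a` is the tunable constant: larger values keep the
    stepsize large for longer, which slows the decay toward zero.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return a / (a + n - 1)


def smooth(observations, a=1.0, initial=0.0):
    """Recursively estimate a mean: est <- (1 - alpha_n) * est + alpha_n * obs."""
    estimate = initial
    for n, obs in enumerate(observations, start=1):
        alpha = harmonic_stepsize(n, a)
        estimate = (1 - alpha) * estimate + alpha * obs
    return estimate


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0]
    # With a = 1 the rule reduces to 1/n, so the estimate is the sample mean.
    print(smooth(data, a=1.0))
    # A larger a puts more weight on recent observations.
    print(smooth(data, a=10.0))
```

With `a = 1` the rule reduces to 1/n and the recursion reproduces the sample mean. Other values of `a` trade off responsiveness against noise, which is the kind of per-application tuning the abstract refers to.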
In a companion paper (Godfrey and Powell 2002) we introduced an adaptive dynamic programming algorithm for stochastic dynamic resource allocation problems, which arise in the context of logistics and distribution, fleet management, and other allocation problems. The method depends on estimating separable nonlinear approximations of value functions, using a …
We consider a class of problems of scheduling n jobs on m identical, uniform, or unrelated parallel machines with an objective of minimizing an additive criterion. We propose a decomposition approach for solving these problems exactly. The decomposition approach first formulates these problems as an integer program, and then reformulates the integer program, …
The dynamic assignment problem arises in a number of application areas in transportation and logistics. Taxi drivers have to be assigned to pick up passengers, police have to be assigned to emergencies, and truck drivers have to pick up and carry loads of freight. All of these problems are characterized by demands that arrive continuously and randomly …
We propose Dirichlet Process mixtures of Generalized Linear Models (DP-GLM), a new class of methods for nonparametric regression. Given a data set of input-response pairs, the DP-GLM produces a global model of the joint distribution through a mixture of local generalized linear models. DP-GLMs allow both continuous and categorical inputs, and can model the …
We derive a one-period look-ahead policy for finite- and infinite-horizon online optimal learning problems with Gaussian rewards. Our approach is able to handle the case where our prior beliefs about the rewards are correlated, which is not handled by traditional multi-armed bandit methods. Experiments show that our KG policy performs competitively against …
We formulate and solve the problem of making advance energy commitments for wind farms in the presence of a storage device with conversion losses, mean-reverting price process, and an auto-regressive energy generation process from wind. We derive an optimal commitment policy under the assumption that wind energy is uniformly distributed. Then, the …