Learn More
We consider a Bayesian ranking and selection problem with independent normal rewards and a correlated multivariate normal belief on the mean values of these rewards. Because this formulation of the ranking and selection problem models dependence between alternatives' mean values, algorithms may utilize this dependence to perform efficiently even when the(More)
We consider the problem of 20 questions with noisy answers, in which we seek to find a target by repeatedly choosing a set, asking an oracle whether the target lies in this set, and obtaining an answer corrupted by noise. Starting with a prior distribution on the target's location, we seek to minimize the expected entropy of the posterior distribution. We(More)
We consider information collection problems, in which we must decide how much and of what type of information to collect. We focus our interest on sequential Bayesian information collection problems. In making such decisions we trade the benefit of information (the ability to make better decisions in the future) against its cost (money, time, or opportunity(More)
In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we refer to as the knowledge-gradient policy. This policy myopically maximizes the expected increment in the value of information in each time period, where the value is measured(More)
Bisection search is the most efficient algorithm for locating a unique point X * ∈ [0, 1] when we are able to query an oracle only about whether X * lies to the left or right of a point x of our choosing. We study a noisy version of this classic problem, where the oracle's response is correct only with probability p. The probabilistic bisection algorithm(More)
Latent feature models are widely used to decompose data into a small number of components. Bayesian nonparametric variants of these models, which use the Indian buffet process (IBP) as a prior over latent features, allow the number of features to be determined from the data. We present a generalization of the IBP, the distance dependent Indian buffet(More)
We derive a one-period look-ahead policy for finite-and infinite-horizon online optimal learning problems with Gaussian rewards. Our approach is able to handle the case where our prior beliefs about the rewards are correlated, which is not handled by traditional multi-armed bandit methods. Experiments show that our KG policy performs competitively against(More)
W e present a new technique for adaptively choosing the sequence of molecular compounds to test in drug discovery. Beginning with a base compound, we consider the problem of searching for a chemical derivative of the molecule that best treats a given disease. The problem of choosing molecules to test to maximize the expected quality of the best compound(More)