Sequential Design of Experiments via Linear Programming


The celebrated multi-armed bandit problem in decision theory models the central trade-off between exploration, or learning about the state of a system, and exploitation, or utilizing the system. In this paper we study the variant of the multi-armed bandit problem where the exploration phase involves costly experiments and occurs before the exploitation…

1 Figure or Table



163 Citations

Semantic Scholar estimates that this publication has 163 citations based on the available data.
