Sequential Design of Experiments via Linear Programming

Abstract

The celebrated multi-armed bandit problem in decision theory models the central trade-off between exploration, or learning about the state of a system, and exploitation, or utilizing the system. In this paper we study the variant of the multi-armed bandit problem where the exploration phase involves costly experiments and occurs before the exploitation…

1 Figure or Table

Statistics

[Figure: Citations per Year. x-axis: years 2005 to 2018; y-axis: citation count, ticks at 0, 20, 40.]

Semantic Scholar estimates that this publication has 163 citations based on the available data.
