Active Learning in Multi-armed Bandits

András Antos, Varun Grover, Csaba Szepesvári
We consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The decision maker selects which option the next sample is drawn from, with the goal of producing equally precise estimates for all options. If sample means are used to estimate the unknown values, then the optimal solution, assuming full knowledge of the distributions except their means, is to sample from each distribution proportional to its…
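As a rough illustration of the setting, the sketch below splits a fixed sampling budget across arms and compares the worst per-arm error of the sample mean under two allocations. It relies only on the standard fact that the mean of n independent draws with variance σ² has expected squared error σ²/n. Using the arm variances as allocation weights is an assumption of this sketch (it is what equalizing σ²/n across arms suggests), as are the names `allocate` and `worst_case_mse` and the example numbers; none of this is taken from the paper, and the variances are treated as known.

```python
from typing import List


def allocate(weights: List[float], budget: int) -> List[int]:
    """Split `budget` samples across arms in proportion to `weights`.

    Largest-remainder rounding keeps the counts summing to `budget`.
    """
    total = sum(weights)
    shares = [budget * w / total for w in weights]
    counts = [int(s) for s in shares]
    # Give the samples lost to rounding to the arms with the largest fractions.
    leftover = budget - sum(counts)
    order = sorted(range(len(shares)),
                   key=lambda k: shares[k] - counts[k],
                   reverse=True)
    for k in order[:leftover]:
        counts[k] += 1
    return counts


def worst_case_mse(variances: List[float], counts: List[int]) -> float:
    """Largest expected squared error of the sample mean: max_k var_k / n_k.

    Assumes every arm receives at least one sample.
    """
    return max(v / n for v, n in zip(variances, counts))


# Hypothetical arm variances and budget, chosen only for illustration.
variances = [1.0, 4.0, 0.25]
budget = 210

uniform = [budget // len(variances)] * len(variances)
proportional = allocate(variances, budget)  # assumption: weights = variances

print("uniform     ", uniform, worst_case_mse(variances, uniform))
print("proportional", proportional, worst_case_mse(variances, proportional))
```

With these inputs the weighted allocation gives every arm the same expected squared error, while the uniform split leaves the high-variance arm noticeably less precise.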
This paper has 45 citations.
