lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

Authors: Kevin G. Jamieson, Matthew Malloy, Robert D. Nowak, Sébastien Bubeck
The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. The procedure cannot be improved in the sense that the number of samples required to identify the best arm is within a constant factor of a lower bound based on the law of the iterated logarithm (LIL). Inspired by the LIL, we construct our confidence bounds to explicitly account for the…
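To make the idea in the abstract concrete, here is a minimal sketch of a UCB-style best-arm search whose confidence width shrinks at a log-log rate in the number of pulls, as the LIL suggests. Everything specific in it is an assumption of this sketch rather than something stated in the abstract: the function names `lil_width` and `lil_ucb_sketch`, the constants `eps`, `beta`, and `lam`, the Bernoulli reward model with `sigma = 0.5`, the pull-count stopping rule, and the `max_pulls` budget. Consult the paper for the exact procedure and parameter conditions.

```python
import math
import random


def lil_width(t, delta, eps=0.01, sigma=0.5):
    """LIL-style confidence width after t pulls of a sigma-sub-Gaussian arm."""
    # The nested logarithm is positive for every t >= 1 only if delta < log(1 + eps).
    if not 0 < delta < math.log(1 + eps):
        raise ValueError("need 0 < delta < log(1 + eps)")
    inner = math.log(math.log((1 + eps) * t) / delta)
    return (1 + math.sqrt(eps)) * math.sqrt(2 * sigma**2 * (1 + eps) * inner / t)


def lil_ucb_sketch(means, delta=0.005, eps=0.01, beta=1.0, lam=9.0,
                   max_pulls=50_000, seed=0):
    """Return (index of the declared best arm, pull counts) for Bernoulli arms."""
    rng = random.Random(seed)
    n = len(means)

    def pull(i):
        # Simulated Bernoulli reward; rewards in [0, 1] are 0.5-sub-Gaussian.
        return 1.0 if rng.random() < means[i] else 0.0

    # Pull every arm once so each empirical mean is defined.
    counts = [1] * n
    sums = [pull(i) for i in range(n)]

    while sum(counts) < max_pulls:
        total = sum(counts)
        # Assumed stopping rule: halt once one arm has been pulled far more
        # often than all other arms combined.
        for i in range(n):
            if counts[i] >= 1 + lam * (total - counts[i]):
                return i, counts
        # Pull the arm with the largest inflated upper confidence bound.
        ucb = [sums[i] / counts[i] + (1 + beta) * lil_width(counts[i], delta, eps)
               for i in range(n)]
        i = max(range(n), key=ucb.__getitem__)
        sums[i] += pull(i)
        counts[i] += 1

    # Budget exhausted: fall back to the most-pulled arm.
    return max(range(n), key=counts.__getitem__), counts


if __name__ == "__main__":
    best, counts = lil_ucb_sketch([0.9, 0.5, 0.3])
    print("declared best arm:", best, "pull counts:", counts)
```

The sketch stops on pull counts rather than on separated confidence intervals. Under UCB sampling a suboptimal arm is pulled only until its upper bound drops just below the leader's, so its interval may never fully separate from the leader's, whereas the leader's share of pulls keeps growing.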



