The K-armed Dueling Bandits Problem

Yisong Yue, Josef Broder, Robert D. Kleinberg, Thorsten Joachims
J. Comput. Syst. Sci.
We study a partial-information online-learning problem where actions are restricted to noisy comparisons between pairs of strategies (also known as bandits). In contrast to conventional approaches that require the absolute reward of the chosen strategy to be quantifiable and observable, our setting assumes only that (noisy) binary feedback about the relative reward of two chosen strategies is available. This type of relative feedback is particularly appropriate in applications where absolute…
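
The feedback model described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only simulates the kind of noisy binary comparison the setting assumes. The preference matrix `PREF`, the helper names `duel` and `estimate_win_rate`, and all numeric values are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical preference matrix for 3 strategies (an assumption, not from the paper).
# PREF[i][j] is the assumed probability that strategy i wins a comparison with j,
# so PREF[i][j] + PREF[j][i] == 1 and PREF[i][i] == 0.5.
PREF = [
    [0.5, 0.6, 0.8],
    [0.4, 0.5, 0.7],
    [0.2, 0.3, 0.5],
]

def duel(pref, i, j, rng):
    """Return True if strategy i wins one noisy comparison against strategy j.

    The learner sees only this binary outcome, never an absolute reward.
    """
    return rng.random() < pref[i][j]

def estimate_win_rate(pref, i, j, n_duels, seed=0):
    """Estimate how often i beats j by repeating the comparison n_duels times."""
    rng = random.Random(seed)
    wins = sum(duel(pref, i, j, rng) for _ in range(n_duels))
    return wins / n_duels

if __name__ == "__main__":
    # Repeated duels recover the relative preference between two strategies.
    print(estimate_win_rate(PREF, 0, 2, 20000))
```

Under these assumptions, the only signal available to a learner is the sequence of win/loss outcomes, which is why repeated comparisons are needed before one strategy can be judged better than another.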
This paper has highly influenced 26 other papers.
This paper has 178 citations.

