# Max-Min Grouped Bandits

@article{Wang2021MaxMinGB,
  title   = {Max-Min Grouped Bandits},
  author  = {Zhenling Wang and Jonathan Scarlett},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2111.08862}
}
• Published 17 November 2021
• Computer Science
• ArXiv
In this paper, we introduce a multi-armed bandit problem termed max-min grouped bandits, in which the arms are arranged in possibly-overlapping groups, and the goal is to find a group whose worst arm has the highest mean reward. This problem is of interest in applications such as recommendation systems, and is also closely related to widely-studied robust optimization problems. We present two algorithms based on successive elimination and robust optimization, and derive upper bounds on the number…
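To make the objective concrete, the sketch below implements the max-min criterion described in the abstract, together with a successive-elimination-style procedure in the spirit the paper describes. This is an illustrative sketch only: the confidence radius, constants, and stopping rule are generic Hoeffding-style choices, not the paper's actual algorithm or guarantees.

```python
import math

def max_min_group(means, groups):
    """Index of the group whose worst (minimum-mean) arm is largest --
    the max-min objective from the abstract."""
    return max(range(len(groups)),
               key=lambda g: min(means[a] for a in groups[g]))

def successive_elimination(pull, groups, n_arms, rounds=500, delta=0.05):
    """Hedged sketch: sample all arms in still-active groups uniformly,
    maintain confidence bounds on each arm's mean, and eliminate a group
    once its optimistic (UCB) max-min value falls below another group's
    pessimistic (LCB) max-min value. Confidence radius is illustrative."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    active = set(range(len(groups)))
    for t in range(1, rounds + 1):
        for a in {a for g in active for a in groups[g]}:
            sums[a] += pull(a)
            counts[a] += 1
        mean = lambda a: sums[a] / counts[a]
        rad = lambda a: math.sqrt(math.log(2 * n_arms * t / delta)
                                  / (2 * counts[a]))
        # Pessimistic and optimistic max-min values for each active group.
        lcb = {g: min(mean(a) - rad(a) for a in groups[g]) for g in active}
        ucb = {g: min(mean(a) + rad(a) for a in groups[g]) for g in active}
        best_lcb = max(lcb.values())
        active = {g for g in active if ucb[g] >= best_lcb}
        if len(active) == 1:
            break
    # Fall back to the empirically best remaining group.
    return max(active,
               key=lambda g: min(sums[a] / counts[a] for a in groups[g]))
```

For example, with arm means `[0.9, 0.2, 0.6, 0.5]` and overlapping-free groups `[[0, 1], [2, 3]]`, the second group wins: its worst arm has mean 0.5, beating the first group's worst arm at 0.2, even though the single best arm lies in the first group.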

