# Federated Bandit

```bibtex
@article{Zhu2021FederatedB,
  title={Federated Bandit},
  author={Zhaowei Zhu and Jingxuan Zhu and Ji Liu and Yang Liu},
  journal={Proceedings of the ACM on Measurement and Analysis of Computing Systems},
  year={2021},
  volume={5},
  pages={1--29}
}
```
• Zhaowei Zhu, Jingxuan Zhu, Ji Liu, Yang Liu
• Published 24 October 2020
• Computer Science
• Proceedings of the ACM on Measurement and Analysis of Computing Systems
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret…
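The setting described above can be illustrated with a minimal sketch: each agent runs a UCB-style rule on its own (biased) observations while a gossip step mixes reward estimates across the graph. The function name `gossip_ucb`, the complete-graph gossip matrix, and the plain mean-mixing step are illustrative assumptions for this sketch, not the paper's actual Fed-UCB algorithm or its privacy mechanism.

```python
import numpy as np

def gossip_ucb(true_means, W, T, rng):
    """Sketch of decentralized bandits with gossip.

    true_means: (N, M) local, possibly biased, Bernoulli mean rewards per agent.
    W: (N, N) doubly stochastic gossip matrix for the communication graph.
    Returns per-agent play counts after T rounds.
    """
    N, M = true_means.shape
    est = np.zeros((N, M))     # gossip-mixed reward estimates
    counts = np.ones((N, M))   # local play counts (each arm played once below)
    for arm in range(M):       # initialization: every agent plays every arm once
        est[:, arm] = rng.binomial(1, true_means[:, arm])
    idx = np.arange(N)
    for t in range(M, T):
        # Each agent picks the arm maximizing its upper confidence bound.
        ucb = est + np.sqrt(2 * np.log(t + 1) / counts)
        arms = ucb.argmax(axis=1)
        # Feedback is drawn from the agent's own biased local distribution.
        rewards = rng.binomial(1, true_means[idx, arms])
        est[idx, arms] += (rewards - est[idx, arms]) / counts[idx, arms]
        counts[idx, arms] += 1
        est = W @ est          # gossip step: mix estimates with neighbors
    return counts

# Usage: 4 agents, 2 arms. Local means are biased, but their network-wide
# averages (0.7 vs 0.4) favor arm 0, so gossip should steer all agents to it.
rng = np.random.default_rng(0)
biased = np.array([[0.8, 0.3], [0.6, 0.5], [0.7, 0.4], [0.7, 0.4]])
W = np.full((4, 4), 0.25)  # complete-graph gossip: one step = exact average
plays = gossip_ucb(biased, W, 2000, rng)
```

Because the gossip matrix is doubly stochastic, the mixed estimates track the network-wide average reward, which is what lets every agent concentrate its plays on the globally optimal arm despite biased local feedback.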
#### Citations (1)

Federated Multi-Armed Bandits
• Computer Science, Mathematics
• AAAI
• 2021
This paper proposes a general framework of FMAB and studies two specific federated bandit models; for the approximate model it proposes Federated Double UCB (Fed2-UCB), which constructs a novel “double UCB” principle accounting for uncertainties from both arm and client sampling.

#### References

SHOWING 1-10 OF 94 REFERENCES
Private and Byzantine-Proof Cooperative Decision-Making
• Computer Science
• AAMAS
• 2020
This work provides upper-confidence-bound algorithms that obtain optimal regret while being differentially private and tolerant to Byzantine agents; the algorithms require no information about the connectivity network between agents, making them scalable to large dynamic systems.
Decentralized Cooperative Stochastic Bandits
• Computer Science
• NeurIPS
• 2019
A fully decentralized algorithm that uses an accelerated consensus procedure to compute (delayed) estimates of the average of rewards obtained by all the agents for each arm, and then uses an upper confidence bound (UCB) algorithm that accounts for the delay and error of the estimates.
Coordinated Versus Decentralized Exploration In Multi-Agent Multi-Armed Bandits
• Computer Science
• IJCAI
• 2017
An algorithm for the decentralized setting is introduced that uses a value-of-information-based communication strategy and an exploration-exploitation strategy based on the centralized algorithm, and it is shown experimentally to converge rapidly to the performance of the centralized method.
Differentially-Private Federated Linear Bandits
• Computer Science, Mathematics
• NeurIPS
• 2020
This paper devises FedUCB, a multi-agent private algorithm for both centralized and decentralized (peer-to-peer) federated learning, which provides competitive performance both in terms of pseudo-regret bounds and empirical benchmark performance in various multi-agent settings.
A Distributed Algorithm for Sequential Decision Making in Multi-Armed Bandit with Homogeneous Rewards
• Computer Science
• 2020 59th IEEE Conference on Decision and Control (CDC)
• 2020
It is shown that when all the agents share a homogeneous reward distribution for each arm, the algorithm achieves guaranteed logarithmic regret for all N agents at the order of O((1 + 2ρ²)² log T / N) when T is large.
• Mathematics, Computer Science
• 2018 IEEE Conference on Decision and Control (CDC)
• 2018
A differentially private distributed algorithm, called private gossip gradient descent, is proposed, which enables all N agents to converge to the true model, with performance comparable to that of conventional centralized algorithms.
Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms
• Computer Science, Mathematics
• 2016 IEEE 55th Conference on Decision and Control (CDC)
• 2016
This work rigorously characterizes the influence of the communication graph structure on the decision-making performance of the group, and proves performance guarantees for state-of-the-art frequentist and Bayesian cooperative distributed algorithms for multi-agent MAB problems in which agents communicate according to a fixed network graph.
Distributed Learning in Multi-Armed Bandit With Multiple Players
• Computer Science, Mathematics
• IEEE Transactions on Signal Processing
• 2010
It is shown that the minimum system regret of the decentralized MAB grows with time at the same logarithmic order as in the centralized counterpart where players act collectively as a single entity by exchanging observations and making decisions jointly.
Gossip-based distributed stochastic bandit algorithms
• Computer Science
• ICML
• 2013
This work shows that the probability that a peer plays a suboptimal arm in iteration t = Ω(log N) is proportional to 1/(Nt), where N denotes the number of peers participating in the network.
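The gossip-averaging primitive underlying results like the one above can be demonstrated in a few lines: repeated mixing with a doubly stochastic matrix drives every peer's local value to the network-wide average. The ring topology, uniform 1/3 weights, and the name `gossip_average` are assumptions for this sketch, not the paper's protocol.

```python
import numpy as np

def gossip_average(values, W, iters):
    """values: (N,) local estimates; W: (N, N) doubly stochastic gossip matrix."""
    x = np.asarray(values, dtype=float)
    for _ in range(iters):
        x = W @ x              # each peer replaces its value with a weighted
    return x                   # average of its own and its neighbors' values

N = 5
# Ring graph: each peer averages itself with its two neighbors, weight 1/3 each.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = W[i, (i - 1) % N] = W[i, (i + 1) % N] = 1 / 3

local = np.array([1.0, 0.0, 0.5, 0.2, 0.8])
mixed = gossip_average(local, W, 100)
# After enough rounds every entry is close to the global mean of `local`.
```

Because W is doubly stochastic, each round preserves the sum of the values while contracting their spread at a rate governed by the graph's spectral gap, which is where graph-dependent constants in the regret bounds above come from.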
Differentially private, multi-agent multi-armed bandits
• Computer Science
• EWRL
• 2015
Two algorithms are derived, built upon a decentralized time-division fair-sharing method and upper confidence bounds, in which all decisions are taken based on private statistics; they provide regret guarantees almost as good as the non-private multi-agent algorithm, and are demonstrated empirically.