# Federated Bandit

```bibtex
@article{Zhu2021FederatedB,
  title   = {Federated Bandit},
  author  = {Zhaowei Zhu and Jingxuan Zhu and Ji Liu and Yang Liu},
  journal = {Proceedings of the ACM on Measurement and Analysis of Computing Systems},
  year    = {2021},
  volume  = {5},
  pages   = {1--29}
}
```

In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet has access only to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions, while converging to a no-regret…
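The setting above — agents running local bandit exploration while mixing their reward statistics with graph neighbors — can be sketched minimally as follows. This is an illustrative toy, not the paper's actual algorithm: the UCB1 index, the pairwise gossip rule, and the per-agent bias model are all assumptions.

```python
import math
import random

class GossipUCBAgent:
    """Illustrative agent: standard UCB1 arm selection on local statistics,
    combined with gossip averaging of per-arm mean estimates (a simplified
    stand-in for the paper's federated update, not the real algorithm)."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.counts = [0] * n_arms   # local pull counts per arm
        self.means = [0.0] * n_arms  # local (possibly biased) mean rewards

    def select_arm(self, t):
        for a in range(self.n_arms):          # play each arm once first
            if self.counts[a] == 0:
                return a
        return max(range(self.n_arms),
                   key=lambda a: self.means[a]
                   + math.sqrt(2 * math.log(t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

def gossip_round(agents, edges):
    """Each edge (i, j) of the communication graph averages the endpoints'
    per-arm mean estimates, so biased local views mix toward a shared one."""
    for i, j in edges:
        for a in range(agents[i].n_arms):
            avg = 0.5 * (agents[i].means[a] + agents[j].means[a])
            agents[i].means[a] = agents[j].means[a] = avg

# Three agents on a line graph; each observes the true arm means shifted
# by a local bias, and the biases cancel out across the network.
rng = random.Random(0)
true_means = [0.2, 0.8]                 # arm 1 is globally optimal
biases = [-0.1, 0.0, 0.1]
agents = [GossipUCBAgent(2) for _ in range(3)]
edges = [(0, 1), (1, 2)]
for t in range(1, 2001):
    for i, agent in enumerate(agents):
        arm = agent.select_arm(t)
        agent.update(arm, true_means[arm] + biases[i] + rng.gauss(0, 0.1))
    gossip_round(agents, edges)
```

After enough rounds, every agent concentrates its pulls on the globally optimal arm even though each one's raw feedback is biased.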

#### One Citation

Federated Multi-Armed Bandits

- Computer Science, Mathematics
- AAAI
- 2021

This paper proposes a general framework of federated multi-armed bandits (FMAB) and then studies two specific federated bandit models, solving the approximate model with Federated Double UCB (Fed2-UCB), a novel "double UCB" principle that accounts for uncertainty from both arm and client sampling.

#### References

Showing 1–10 of 94 references

Private and Byzantine-Proof Cooperative Decision-Making

- Computer Science
- AAMAS
- 2020

This work provides upper confidence bound algorithms that achieve optimal regret while being differentially private and tolerant to Byzantine agents; the algorithms require no information about the connectivity network between agents, making them scalable to large dynamic systems.

Decentralized Cooperative Stochastic Bandits

- Computer Science
- NeurIPS
- 2019

Proposes a fully decentralized algorithm that uses an accelerated consensus procedure to compute (delayed) estimates of the average reward obtained by all agents for each arm, and then applies an upper confidence bound (UCB) algorithm that accounts for the delay and error of those estimates.
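The consensus building block this entry relies on — neighbors repeatedly averaging their estimates so every agent tracks the network-wide mean reward — can be sketched minimally. The line-graph topology and Metropolis mixing weights below are assumed choices for illustration, not taken from the paper.

```python
def consensus_step(values, W):
    """One consensus iteration: agent i replaces its estimate with a
    W-weighted average of its neighbors' estimates. If W is doubly
    stochastic, the network average is preserved at every step."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]

# Line graph 0-1-2 with Metropolis weights (assumed): local reward
# estimates start far apart and converge geometrically to their
# network-wide average, here (0 + 3 + 6) / 3 = 3.0.
W = [[2/3, 1/3, 0.0],
     [1/3, 1/3, 1/3],
     [0.0, 1/3, 2/3]]
values = [0.0, 3.0, 6.0]
for _ in range(50):
    values = consensus_step(values, W)
```

The convergence rate is governed by the second-largest eigenvalue modulus of W (2/3 for this graph), which is why delayed, slightly erroneous averages are what each agent actually has to work with.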

Coordinated Versus Decentralized Exploration In Multi-Agent Multi-Armed Bandits

- Computer Science
- IJCAI
- 2017

Introduces an algorithm for the decentralized setting that uses a value-of-information based communication strategy together with an exploration-exploitation strategy based on the centralized algorithm, and shows experimentally that it converges rapidly to the performance of the centralized method.

Differentially-Private Federated Linear Bandits

- Computer Science, Mathematics
- NeurIPS
- 2020

This paper devises FedUCB, a multi-agent private algorithm for both centralized and decentralized (peer-to-peer) federated learning, which provides competitive performance in terms of both pseudo-regret bounds and empirical benchmark results in various multi-agent settings.

A Distributed Algorithm for Sequential Decision Making in Multi-Armed Bandit with Homogeneous Rewards

- Computer Science
- 2020 59th IEEE Conference on Decision and Control (CDC)
- 2020

It is shown that when all the agents share a homogeneous distribution of each arm's reward, the algorithm achieves guaranteed logarithmic regret for all N agents of order O((1 + 2ρ²)² log T / N) when T is large.

Differentially Private Gossip Gradient Descent

- Mathematics, Computer Science
- 2018 IEEE Conference on Decision and Control (CDC)
- 2018

A differentially private distributed algorithm, called private gossip gradient descent, is proposed, which enables all N agents to converge to the true model with performance comparable to that of conventional centralized algorithms.
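The private-gossip idea in this entry — perturb what each agent shares with calibrated noise before mixing it into the network — can be illustrated with a small sketch. The Laplace mechanism, quadratic objectives, mixing matrix, and step size below are all assumptions for illustration, not the paper's exact algorithm or privacy analysis.

```python
import random

def laplace(rng, scale):
    # Laplace(0, scale) sampled as a random-sign exponential.
    return rng.choice([-1.0, 1.0]) * rng.expovariate(1.0 / scale)

def private_gossip_sgd_step(models, targets, W, lr, eps, sens, rng):
    """One round of illustrative differentially private gossip gradient
    descent: each agent averages its neighbors' models through the mixing
    matrix W, then steps along its own Laplace-perturbed local gradient
    (noise scale sens/eps, the standard Laplace-mechanism calibration)."""
    n = len(models)
    mixed = [sum(W[i][j] * models[j] for j in range(n)) for i in range(n)]
    grads = [2.0 * (models[i] - targets[i]) for i in range(n)]
    return [mixed[i] - lr * (grads[i] + laplace(rng, sens / eps))
            for i in range(n)]

# Three agents on a line graph each minimize a local f_i(x) = (x - c_i)^2;
# the network-wide optimum is the mean of the c_i (here 2.0).
W = [[2/3, 1/3, 0.0],
     [1/3, 1/3, 1/3],
     [0.0, 1/3, 2/3]]
targets = [1.0, 2.0, 3.0]
rng = random.Random(1)
models = [0.0, 0.0, 0.0]
for _ in range(500):
    models = private_gossip_sgd_step(models, targets, W,
                                     lr=0.05, eps=1.0, sens=1.0, rng=rng)
```

The injected noise keeps any single shared update from revealing an agent's local data exactly, at the cost of the iterates fluctuating around, rather than settling exactly on, the network optimum.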

Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms

- Computer Science, Mathematics
- 2016 IEEE 55th Conference on Decision and Control (CDC)
- 2016

This work rigorously characterizes the influence of the communication graph structure on the group's decision-making performance, analyzing state-of-the-art frequentist and Bayesian cooperative distributed algorithms for multi-agent MAB problems in which agents communicate according to a fixed network graph.

Distributed Learning in Multi-Armed Bandit With Multiple Players

- Computer Science, Mathematics
- IEEE Transactions on Signal Processing
- 2010

It is shown that the minimum system regret of the decentralized MAB grows with time at the same logarithmic order as in the centralized counterpart where players act collectively as a single entity by exchanging observations and making decisions jointly.

Gossip-based distributed stochastic bandit algorithms

- Computer Science
- ICML
- 2013

This work shows that the probability of playing a suboptimal arm at a peer in iteration t is proportional to 1/(Nt) for t = Ω(log N), where N denotes the number of peers participating in the network.

Differentially private, multi-agent multi-armed bandits

- Computer Science
- EWRL 2015
- 2015

Two algorithms, built upon a decentralized Time Division Fair Sharing method and upper confidence bounds, are derived in which all decisions are taken based on private statistics; they provide regret guarantees almost as good as those of the non-private multi-agent algorithm, and are demonstrated empirically.