Corpus ID: 232233375

Beyond log^2(T) Regret for Decentralized Bandits in Matching Markets

@article{Basu2021BeyondLR,
  title={Beyond $\log^2(T)$ Regret for Decentralized Bandits in Matching Markets},
  author={Soumya Basu and Karthik Abinav Sankararaman and Abishek Sankararaman},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.07501}
}
We design decentralized algorithms for regret minimization in two-sided matching markets with one-sided bandit feedback that significantly improve upon prior work [23, 27, 24]. First, for general markets and any ε > 0, we design an algorithm that achieves O(log^{1+ε}(T)) regret to the agent-optimal stable matching, with unknown time horizon T, improving upon the O(log^2(T)) regret achieved in [24]. Second, we provide the optimal Θ(log(T)) regret for markets satisfying uniqueness consistency…

References

SHOWING 1-10 OF 29 REFERENCES
Dominate or Delete: Decentralized Competing Bandits with Uniform Valuation
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits
Competing Bandits in Matching Markets
Cooperative Multi-Agent Bandits with Heavy Tails
Game of Thrones: Fully Distributed Learning for Multiplayer Bandits
Social Learning in Multi Agent Multi Armed Bandits
Information sharing in distributed stochastic bandits
Collaborative learning of stochastic bandits over a social network
Decentralized Learning for Multiplayer Multiarmed Bandits
Matching while Learning