Bayesian Exploration: Incentivizing Exploration in Bayesian Games

@inproceedings{Mansour2016BayesianEI,
  title={Bayesian Exploration: Incentivizing Exploration in Bayesian Games},
  author={Yishay Mansour and Aleksandrs Slivkins and Vasilis Syrgkanis and Zhiwei Steven Wu},
  booktitle={Proceedings of the 2016 ACM Conference on Economics and Computation},
  year={2016}
}
We consider a ubiquitous scenario in the Internet economy when individual decision-makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain environment. This creates a three-way trade-off between exploration (trying out insufficiently explored alternatives to help others in the future), exploitation (making optimal decisions given the information discovered by other agents), and incentives of the agents (who are myopically interested in…
Exploration and Persuasion
How to incentivize self-interested agents to explore when they prefer to exploit? Consider a population of self-interested agents that make decisions under uncertainty. They explore to acquire new…
Bayesian Exploration with Heterogeneous Agents
TLDR
This work considers Bayesian Exploration, a simple model in which the recommendation system (the “principal”) controls the information flow to the users and strives to incentivize exploration via information asymmetry, and extends it to allow heterogeneous users.
Tutorial: Incentivizing and Coordinating Exploration
While exploration-exploitation tradeoffs are well-studied, in many scenarios exploration is performed by self-interested individuals (agents) who make their own decisions. For example, a decision to…
Bayesian Incentive-Compatible Bandit Exploration
TLDR
A black-box reduction from an arbitrary multi-armed bandit algorithm to an incentive-compatible one is provided, with only a constant multiplicative increase in regret; it works for very general bandit settings, even ones that incorporate contexts and arbitrary partial feedback.
Incentivizing Exploration by Heterogeneous Users
TLDR
An algorithm is proposed that incentivizes arms played infrequently in the past whose probability of being played in the next round would be small without incentives, and achieves expected cumulative regret of O(Ne + N log(T)), using expected cumulative payments of O(Ne).
Incentivizing Bandit Exploration: Recommendations as Instruments
TLDR
A novel recommendation mechanism is provided that views the planner’s recommendations as a form of instrumental variables (IV) that affect only agents’ arm selection and not the observed rewards, which enables the social learning process to minimize regret over the long term.
Optimal Algorithm for Bayesian Incentive-Compatible Exploration
TLDR
This work considers a social planner faced with a stream of myopic selfish agents, and proposes an optimal algorithm for the planner in the case where the action realizations are deterministic and have limited support.
Incentivizing and Coordinating Exploration (Tutorial proposal for ALT 2019)
While exploration-exploitation tradeoffs are well-studied, in many scenarios exploration is performed by self-interested individuals (agents) who make their own decisions. For example, a decision to…
(Almost) Free Incentivized Exploration from Decentralized Learning Agents
  • Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen
  • Mathematics, Computer Science
  • ArXiv
  • 2021
TLDR
It turns out that increasing the population of agents significantly lowers the principal’s burden of incentivizing, and when there are sufficiently many learning agents involved, the exploration process of the principal can be (almost) free.
Bayesian Persuasion in Sequential Decision-Making
TLDR
It is shown that if the principal has the power to threaten the agent by not providing future signals, then the principal can efficiently design a threat-based strategy that guarantees the principal’s payoff as if playing against an agent who is far-sighted but myopic to future signals.

References

Bayesian Exploration: Incentivizing Exploration in Bayesian Games
We consider a ubiquitous scenario in the Internet economy when individual decision-makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain…
Bayesian Incentive-Compatible Bandit Exploration
TLDR
A black-box reduction from an arbitrary multi-armed bandit algorithm to an incentive-compatible one is provided, with only a constant multiplicative increase in regret; it works for very general bandit settings, even ones that incorporate contexts and arbitrary partial feedback.
Incentivizing exploration
We study a Bayesian multi-armed bandit (MAB) setting in which a principal seeks to maximize the sum of expected time-discounted rewards obtained by pulling arms, when the arms are actually pulled by…
Economic Recommendation Systems
TLDR
It turns out that when observability is factored in, the scheme proposed by Kremer et al. (JPE, 2014) is no longer incentive compatible, so a tight bound is provided on how many other agents each agent can observe while still allowing an incentive-compatible algorithm and an asymptotically optimal outcome.
Algorithmic Bayesian persuasion
TLDR
This paper examines persuasion through a computational lens, focusing on the celebrated Bayesian persuasion model of Kamenica and Gentzkow. It examines the sender's optimization task in three of the most natural input models for this problem, and essentially pins down its computational complexity in each.
Truthful incentives in crowdsourcing tasks using regret minimization mechanisms
TLDR
This paper designs a novel, no-regret posted price mechanism, BP-UCB, for budgeted procurement in stochastic online settings, proves strong theoretical guarantees about the mechanism, and extensively evaluates it in simulations as well as on real data from the Mechanical Turk platform.
Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems
TLDR
A multi-round version of the well-known principal-agent model is considered, whereby in each round a worker makes a strategic choice of the effort level, which is not directly observable by the requester; this significantly generalizes the budget-free online task pricing problems studied in prior work.
Implementing the "Wisdom of the Crowd"
TLDR
The optimal disclosure policy of a planner whose goal is to maximize social welfare is characterized, which is the implementation of what is known as the 'wisdom of the crowd'.
Finite-time Analysis of the Multiarmed Bandit Problem
TLDR
This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
Strategic Experimentation with Exponential Bandits
This paper studies a game of strategic experimentation with two-armed bandits whose risky arm might yield a payoff only after some exponentially distributed random time. Because of free-riding, there…