Corpus ID: 235367658

Improving Social Welfare While Preserving Autonomy via a Pareto Mediator

@article{McAleer2021ImprovingSW,
  title={Improving Social Welfare While Preserving Autonomy via a Pareto Mediator},
  author={Stephen McAleer and John Lanier and Michael Dennis and Pierre Baldi and Roy Fox},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.03927}
}
Machine learning algorithms often make decisions on behalf of agents with varied and sometimes conflicting interests. In domains where agents can choose to take their own action or delegate their action to a central mediator, an open question is how mediators should take actions on behalf of delegating agents. The main existing approach uses delegating agents to punish non-delegating agents in an attempt to get all agents to delegate, which tends to be costly for all. We introduce a Pareto… 
1 Citation

Projects with allocated PhD studentships Algorithms and Data Analysis

  • Computer Science
  • 2020
This project aims to define formal generic models of administrative access control, based on the Category-Based Meta Model of access control (CBAC), which can be used to analyse access control systems and help identify the impact of changes made by administrators (impact change) on the overall security of the system.

References

SHOWING 1-10 OF 54 REFERENCES

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

Empirical results demonstrate that influence leads to enhanced coordination and communication in challenging social dilemma environments, markedly improving the learning curves of the deep RL agents and leading to more meaningful learned communication protocols.

Multi-agent Reinforcement Learning in Sequential Social Dilemmas

This work analyzes the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games, and characterizes how learned behavior in each domain changes as a function of environmental factors, including resource abundance.

Strong mediated equilibrium

Learning with Opponent-Learning Awareness

Results show that the encounter of two LOLA agents leads to the emergence of tit-for-tat, and therefore cooperation, in the iterated prisoner's dilemma, while independent learning does not. LOLA also receives higher payouts than a naive learner and is robust against exploitation by higher-order gradient-based methods.

Inequity aversion improves cooperation in intertemporal social dilemmas

It is found that inequity aversion improves temporal credit assignment for the important class of intertemporal social dilemmas and helps explain how large-scale cooperation may emerge and persist.

Stable Opponent Shaping in Differentiable Games

Stable Opponent Shaping (SOS) is presented, a new method that interpolates between LOLA and a stable variant named LookAhead that converges locally to equilibria and avoids strict saddles in all differentiable games.

Mediators in position auctions

This paper introduces a study of mediators for games with incomplete information, and applies it to the context of position auctions, a central topic in electronic commerce.

Game-theoretic recommendations: some progress in an uphill battle

This work argues that the central game-theoretic solution concept, the Nash equilibrium, does not solve what the authors believe to be the major challenges of game theory and the theory of multi-agent systems.

Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination

This paper introduces MERL (Multiagent Evolutionary RL), a hybrid algorithm that does not require an explicit alignment between local and global objectives and that uses fast, policy-gradient-based learning for each agent by utilizing their dense local rewards.

Multi-armed bandits in multi-agent networks

This paper addresses the multi-armed bandit problem in a multi-player framework with a distributed variant of the well-known UCB1 algorithm that is optimal in the sense that in a complete network it scales down the regret of its single-player counterpart by the network size.
...