Corpus ID: 560547

CLEAN rewards for improving multiagent coordination in the presence of exploration

@inproceedings{HolmesParker2013CLEANRF,
  title={CLEAN rewards for improving multiagent coordination in the presence of exploration},
  author={Chris HolmesParker and A. Agogino and Kagan Tumer},
  booktitle={AAMAS},
  year={2013}
}
In cooperative multiagent systems, coordinating the joint actions of agents is difficult. One of the fundamental difficulties in such multiagent systems is the slow learning process: an agent may not only need to learn how to behave in a complex environment, but may also need to account for the actions of the other learning agents. Here, the inability of agents to distinguish the true environmental dynamics from those caused by the stochastic exploratory actions of other agents creates…
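To make the exploratory-action-noise problem concrete, here is a minimal toy sketch (my own construction, not from the paper): two independent ε-greedy Q-learners share a global reward in a repeated 2x2 coordination game, and each agent's value estimates absorb the noise injected by its partner's exploration.

```python
import random

# Toy illustration (mine, not the paper's): two independent
# epsilon-greedy Q-learners share the global reward of a repeated
# 2x2 coordination game. Each agent's reward for a fixed action
# fluctuates whenever its partner explores, so its value estimates
# absorb that exploratory action noise.

PAYOFF = [[10, 0],  # joint reward for actions (a0, a1)
          [0, 10]]
EPSILON, ALPHA = 0.2, 0.1
q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]

def choose(agent):
    if random.random() < EPSILON:                    # exploratory action
        return random.randrange(2)
    return max(range(2), key=lambda a: q[agent][a])  # greedy action

random.seed(0)
for _ in range(5000):
    a0, a1 = choose(0), choose(1)
    g = PAYOFF[a0][a1]  # each agent sees only this shared reward
    # Neither agent can tell whether a low g reflects its own action
    # or the partner's random exploration.
    q[0][a0] += ALPHA * (g - q[0][a0])
    q[1][a1] += ALPHA * (g - q[1][a1])

print(q)  # greedy-action values settle below the noise-free value of 10
```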
Citations

CLEANing the reward: counterfactual actions to remove exploratory action noise in multiagent learning (extended abstract)
This work introduces Coordinated Learning without Exploratory Action Noise (CLEAN) rewards and empirically demonstrates their benefits.
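The CLEAN mechanism named above lends itself to a short sketch. The following is a hedged illustration under the assumption that each agent can re-evaluate the global objective offline with one action swapped; the toy game and all names are illustrative, not the paper's implementation.

```python
import random

# Hedged sketch of the CLEAN idea as summarized above. Every agent
# executes its greedy action in the real system, so no exploration
# noise enters the joint action other agents experience. Exploration
# happens offline: each agent privately samples a counterfactual
# action c and is rewarded with the counterfactual gain
#   C_i = G(z with z_i replaced by c) - G(z),
# which assumes agents can re-evaluate the global objective G offline.

PAYOFF = [[10, 0], [0, 10]]        # same 2x2 coordination game as above
G = lambda z: PAYOFF[z[0]][z[1]]   # global evaluation of a joint action

ALPHA = 0.1
q = [[0.0, 0.0], [0.0, 0.0]]       # q[agent][action]

random.seed(0)
for _ in range(5000):
    # Executed joint action: purely greedy, hence noise-free.
    greedy = [max(range(2), key=lambda a: q[i][a]) for i in range(2)]
    for i in range(2):
        c = random.randrange(2)    # private, offline "exploration"
        z_c = list(greedy)
        z_c[i] = c
        # CLEAN reward: what switching to c would have gained.
        q[i][c] += ALPHA * ((G(z_c) - G(greedy)) - q[i][c])

print(q)  # learned without injecting exploratory actions into the system
```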
Exploiting Structure and Agent-Centric Rewards to Promote Coordination in Large Multiagent Systems
When scaling systems to hundreds or thousands of agents, the ability of agents to observe their environment and to coordinate during decision making becomes increasingly difficult.
CLEAN Learning to Improve Coordination and Scalability in Multiagent Systems
Recent advances in multiagent learning have led to exciting new capabilities spanning fields as diverse as planetary exploration, air traffic control, and military reconnaissance.
Exploiting structure and utilizing agent-centric rewards to promote coordination in large multiagent systems
This work couples a Factored-Action Factored Markov Decision Process (FA-FMDP) framework, which exploits problem structure and establishes localized rewards for agents, with reinforcement learning using agent-centric difference rewards, which promotes coordination by addressing the structural credit assignment problem.
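Since this entry centers on agent-centric difference rewards for structural credit assignment, a short sketch may help. This is an illustration under my own assumptions; the function names and the toy objective G are hypothetical, not taken from the paper.

```python
# Hedged sketch of an agent-centric difference reward. D_i compares
# the global objective with and without agent i's contribution, here
# by replacing its action with a "null" counterfactual, which isolates
# the credit attributable to agent i.

def difference_reward(G, joint, i, null_action=None):
    """D_i = G(z) - G(z with agent i's action replaced by null_action)."""
    z_null = list(joint)
    z_null[i] = null_action
    return G(joint) - G(z_null)

# Illustrative G: number of distinct targets observed by any agent;
# a null action observes nothing.
G = lambda z: len({a for a in z if a is not None})

print(difference_reward(G, ["t1", "t1", "t2"], 1))  # 0: agent 1 is redundant
print(difference_reward(G, ["t1", "t1", "t2"], 2))  # 1: agent 2 adds unique value
```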
AN ABSTRACT OF THE THESIS OF
Air traffic flow management over the U.S. airspace is a difficult problem. Current management approaches lead to hundreds of thousands of hours of delay, costing billions of…

References

Cooperative Multi-Agent Learning: The State of the Art
This survey attempts to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics, and finds that this broad view leads to a division of the work into two categories.
Reward shaping for valuing communications during multi-agent coordination
This research presents a novel model of rational communication that uses reward shaping to value communications and employs this valuation in decentralised POMDP policy generation; an empirical evaluation of the benefits is presented in two domains.
Reinforcement Learning: An Introduction
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Introduction to Reinforcement Learning
In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.
Evolving large scale UAV communication system
Experimental results show that UAVs using evolutionary algorithms in combination with appropriately shaped evaluation functions can form a robust communication network, performing 180% better than a fixed baseline algorithm and 90% better than a basic evolutionary algorithm.