We report on an investigation of reinforcement learning techniques for the learning of coordination in cooperative multi-agent systems. Specifically, we focus on a novel action selection strategy for Q-learning (Watkins, 1989). The new technique is applicable to scenarios where mutual observation of actions is not possible. To date, reinforcement learning …
The family of terminological representation systems has its roots in the representation system KL-ONE. Since the development of this system, more than a dozen similar representation systems have been developed by various research groups. These systems vary along a number of dimensions. In this paper, we present the results of an empirical analysis of six …
This paper motivates research into implementing nature-inspired algorithms in decentralised, asynchronous and parallel environments. These characteristics typify environments such as peer-to-peer systems, the Grid and autonomic computing, which demand robustness, decentralisation, parallelism, asynchronicity and self-organisation. Nature-inspired systems …
Potential-based reward shaping can significantly improve the time needed to learn an optimal policy and, in multi-agent systems, the performance of the final joint policy. It has been proven not to alter the optimal policy of an agent learning alone, nor the Nash equilibria of multiple agents learning together. However, a limitation of existing proofs is the …
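The potential-based form referred to here is the standard one: the shaping reward added to the environment reward is F(s, s') = γΦ(s') − Φ(s), for a potential function Φ over states. A minimal Python sketch; the linear potential and the discount value are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of potential-based reward shaping.
# phi and GAMMA are illustrative assumptions for a toy chain of states 0..4.

GAMMA = 0.9

def phi(state):
    # Illustrative potential: states closer to the goal (state 4) score higher.
    return float(state)

def shaped_reward(r, s, s_next):
    # F(s, s') = gamma * phi(s') - phi(s); adding F to r leaves the
    # optimal policy unchanged (Ng, Harada & Russell, 1999).
    return r + GAMMA * phi(s_next) - phi(s)
```

Because F telescopes along any trajectory, the shaping term cancels out of the return comparison between policies, which is why optimality is preserved.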
We report on an investigation of reinforcement learning techniques for the learning of coordination in cooperative multi-agent systems. These techniques are variants of Q-learning (Watkins, 1989) that are applicable to scenarios where mutual observation of actions is not possible. To date, reinforcement learning approaches for such independent agents did …
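The base rule these variants build on is the standard Watkins Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ max_b Q(s',b) − Q(s,a)]. A minimal sketch for one independent learner; the table layout and the constants are illustrative assumptions:

```python
from collections import defaultdict

# Illustrative constants; the papers' own settings are not given in the snippet.
ALPHA, GAMMA = 0.1, 0.95

def q_update(Q, s, a, r, s_next, actions):
    # Standard Watkins Q-learning update. Each independent agent keeps its
    # own table and sees only its own action and the reward, not the
    # actions of the other agents.
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

Q = defaultdict(float)
q_update(Q, "s0", 0, 1.0, "s1", actions=[0, 1])
```

In the independent-learner setting described above, each agent runs this update on its private table, so coordination has to emerge through the shared reward signal alone.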
Potential-based reward shaping has previously been proven both to be equivalent to Q-table initialisation and to guarantee policy invariance in single-agent reinforcement learning. The method has since been used in multi-agent reinforcement learning without consideration of whether the theoretical equivalence and guarantees hold. This paper extends the …
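The single-agent equivalence mentioned here (due to Wiewiora) states that learning with shaping rewards F(s,s') = γΦ(s') − Φ(s) and learning with a Q-table initialised to Φ(s) produce identical updates given the same experience. A small check on a toy three-state chain; the MDP, potential and constants are illustrative assumptions:

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9
phi = {0: 0.0, 1: 1.0, 2: 2.0}   # illustrative potential on a 3-state chain
ACTIONS = [0, 1]

def update(Q, s, a, r, s_next):
    # Standard Q-learning update.
    best = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best - Q[(s, a)])

# Agent A: shaped reward, zero-initialised table.
QA = defaultdict(float)
# Agent B: plain reward, table initialised to the potential of the state.
QB = defaultdict(float)
for s in phi:
    for a in ACTIONS:
        QB[(s, a)] = phi[s]

# Feed both agents the same experience tuples (s, a, r, s').
experience = [(0, 0, 0.0, 1), (1, 1, 0.0, 2), (2, 0, 1.0, 2)]
for s, a, r, s_next in experience:
    update(QA, s, a, r + GAMMA * phi[s_next] - phi[s], s_next)  # shaped
    update(QB, s, a, r, s_next)                                  # initialised

# Wiewiora's equivalence: QB(s, a) - phi(s) == QA(s, a) after any shared
# experience sequence, so both agents rank actions identically in every state.
```

The invariant QB(s,a) = QA(s,a) + Φ(s) is preserved by each update, which is the single-agent result whose multi-agent extension the paper investigates.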