Coordinated Reinforcement Learning for Decentralized Optimal Control

Daniel Yagan and Chen-Khong Tham
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

We consider a multi-agent system in which overall performance depends on the joint actions or policies of the agents, yet each agent observes only a partial view of the global state. This model is known as a decentralized partially observable Markov decision process (DEC-POMDP), which is arguably a closer fit to real-world applications such as communication networks. Solving a DEC-POMDP exactly is known to be NEXP-complete, and memory requirements grow…