• Corpus ID: 238407987

Cooperative Multi-Agent Actor-Critic for Privacy-Preserving Load Scheduling in a Residential Microgrid

@article{Qin2021CooperativeMA,
  title={Cooperative Multi-Agent Actor-Critic for Privacy-Preserving Load Scheduling in a Residential Microgrid},
  author={Zhaoming Qin and Nanqing Dong and Eric P. Xing and Junwei Cao},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.02784}
}
As a scalable data-driven approach, multi-agent reinforcement learning (MARL) has made remarkable advances in solving cooperative residential load scheduling problems. However, the common centralized training strategy of MARL algorithms raises privacy risks for the involved households. In this work, we propose a privacy-preserving multi-agent actor-critic framework where the decentralized actors are trained with distributed critics, such that both the decentralized execution and the distributed… 

References

SHOWING 1-10 OF 40 REFERENCES

Demand-Side Scheduling Based on Multi-Agent Deep Actor-Critic Learning for Smart Grids

  • Joash Lee, Wenbo Wang, D. Niyato
  • Computer Science, Engineering
    2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
  • 2020
Simulation results show that the online deep reinforcement learning method can reduce both the peak-to-average ratio of total energy consumed and the cost of electricity for all households based purely on instantaneous observations and a price signal.

Privacy Preserving Load Control of Residential Microgrid via Deep Reinforcement Learning

A novel deep reinforcement learning algorithm is developed that integrates a recurrent neural network to accommodate the partial observability of state due to privacy issues, and the results demonstrate the superiority and flexibility of the developed algorithm compared with a prior privacy-preserving load control method.

Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning

The key motivation is that credit assignment among agents may not require an explicit formulation as long as the policy gradients derived from a centralized critic carry sufficient information for the decentralized agents to maximize their joint action value through optimal cooperation.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
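The centralized-critic / decentralized-actor idea above can be sketched in a toy form. This is an assumed illustration with hypothetical linear actors and critic (not the paper's networks): during training each agent's critic sees every agent's observations and actions, while each actor conditions only on its own observation.

```python
import numpy as np

# Toy dimensions (assumptions for illustration only).
rng = np.random.default_rng(4)
n_agents, obs_dim, act_dim = 2, 3, 1

obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]

# Decentralized actors: each action depends only on that agent's own observation.
actor_w = [rng.normal(size=obs_dim) for _ in range(n_agents)]
acts = [w @ o for w, o in zip(actor_w, obs)]

# Centralized critic input: the concatenation of ALL agents' observations
# and actions, available only during training.
critic_in = np.concatenate(obs + [np.atleast_1d(a) for a in acts])
critic_w = rng.normal(size=critic_in.size)
q_value = float(critic_w @ critic_in)

# The critic's input covers every agent's obs and action.
assert critic_in.size == n_agents * (obs_dim + act_dim)
```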

Value-Decomposition Networks For Cooperative Multi-Agent Learning

This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
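The decomposition described above can be made concrete with a minimal numeric sketch (toy sizes and random per-agent values are assumptions, not the paper's architecture): when the team value is modeled as a sum of per-agent values, each agent's independent greedy choice also maximizes the joint value.

```python
import numpy as np

# Hypothetical toy setup: 2 agents, 3 actions each; q_agent[i][a] stands
# in for agent i's learned value of action a at a fixed observation.
rng = np.random.default_rng(0)
q_agent = [rng.normal(size=3) for _ in range(2)]

# VDN's key idea: the joint action value is the SUM of per-agent values.
def joint_q(actions):
    return sum(q[a] for q, a in zip(q_agent, actions))

# Decentralized greedy selection: each agent maximizes its own value...
greedy = [int(np.argmax(q)) for q in q_agent]

# ...and because the sum is monotone in each term, this also maximizes
# the joint value over all 3 x 3 joint actions.
best_joint = max(joint_q((a0, a1)) for a0 in range(3) for a1 in range(3))
assert np.isclose(joint_q(greedy), best_joint)
```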

Actor-Attention-Critic for Multi-Agent Reinforcement Learning

This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism to select relevant information for each agent at every timestep; this enables more effective and scalable learning in complex multi-agent environments compared to recent approaches.
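The attention step alone can be sketched as follows. This is a hedged illustration with assumed toy sizes (not the paper's encoders): for one focal agent, the critic scores the other agents' encodings against a query and forms a convex combination of their values.

```python
import numpy as np

# Assumed toy encodings: one focal agent's query and 3 other agents'
# key/value vectors of dimension 4.
rng = np.random.default_rng(2)
query = rng.normal(size=4)          # encoding of the focal agent
keys = rng.normal(size=(3, 4))      # encodings of the other agents
vals = rng.normal(size=(3, 4))

# Scaled dot-product attention: scores -> softmax weights -> weighted summary.
scores = keys @ query / np.sqrt(4)
weights = np.exp(scores - scores.max())
weights /= weights.sum()            # attention weights sum to 1
summary = weights @ vals            # "relevant information" for this agent

assert np.isclose(weights.sum(), 1.0)
```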

Distributed Deep Reinforcement Learning for Intelligent Load Scheduling in Residential Smart Grids

This article employs a model-free method for the households that works with limited information about the uncertain factors, and adopts a distributed deep reinforcement learning method to search for the Nash equilibrium of a noncooperative stochastic game.

DOP: Off-Policy Multi-Agent Decomposed Policy Gradients

This paper investigates causes that hinder the performance of MAPG algorithms and presents a multi-agent decomposed policy gradient method (DOP), which introduces the idea of value function decomposition into the multi-agent actor-critic framework and formally shows that DOP critics have sufficient representational capability to guarantee convergence.

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
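The monotonicity constraint described above can be illustrated with a minimal sketch (a single linear mixing layer with assumed toy sizes, not the paper's hypernetwork-generated mixer): forcing the mixing weights to be non-negative makes the joint value non-decreasing in every per-agent value.

```python
import numpy as np

# Non-negative mixing weights (QMIX enforces this via absolute values of
# hypernetwork outputs; here we just take |w| of random weights).
rng = np.random.default_rng(1)
w = np.abs(rng.normal(size=2))
b = rng.normal()

def mix(per_agent_qs):
    # A one-layer stand-in for the mixing network.
    return float(w @ per_agent_qs + b)

# Monotonicity: raising any single agent's value never lowers the joint value.
q = np.array([0.2, -0.5])
q_up = q + np.array([1.0, 0.0])     # agent 0's value improves
assert mix(q_up) >= mix(q)
```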

Model-Free Real-Time Autonomous Control for a Residential Multi-Energy System Using Deep Reinforcement Learning

A novel real-time autonomous energy management strategy for a residential MES is proposed using a model-free deep reinforcement learning (DRL) approach, combining the state-of-the-art deep deterministic policy gradient (DDPG) method with an innovative prioritized experience replay strategy.
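One common form of prioritized experience replay can be sketched as follows. This is an assumed, generic proportional-prioritization scheme (not necessarily the article's exact strategy): transitions with larger TD-error magnitude are sampled with proportionally higher probability.

```python
import numpy as np

# Hypothetical TD errors for 4 stored transitions.
rng = np.random.default_rng(3)
td_errors = np.array([0.1, 2.0, 0.5, 0.05])

# Proportional priorities: p_i = (|delta_i| + eps)^alpha.
eps, alpha = 1e-2, 0.6
prios = (np.abs(td_errors) + eps) ** alpha
probs = prios / prios.sum()

# Draw a minibatch of indices according to the priority distribution.
batch = rng.choice(len(td_errors), size=1000, p=probs)
counts = np.bincount(batch, minlength=4)

# The high-error transition (index 1) is sampled most often.
assert counts[1] == counts.max()
```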