Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings

Liang Yu, Yi Sun, Zhanbo Xu, Chao Shen, Dong Yue, Tao Jiang, Xiaohong Guan
IEEE Transactions on Smart Grid
In commercial buildings, about 40%–50% of the total electricity consumption is attributed to Heating, Ventilation, and Air Conditioning (HVAC) systems, which places an economic burden on building operators. In this paper, we intend to minimize the energy cost of an HVAC system in a multi-zone commercial building with the consideration of random zone occupancy, thermal comfort, and indoor air quality comfort. Due to the existence of unknown thermal dynamics models, parameter uncertainties (e.g… 


A Review of Deep Reinforcement Learning for Smart Building Energy Management
A comprehensive review of DRL for SBEM is provided from the perspective of system scale; unresolved issues are identified and possible future research directions are pointed out.
Deep Reinforcement Learning for Smart Building Energy Management: A Survey
This paper presents a comprehensive literature review on DRL for smart building energy management (SBEM), introduces the fundamentals of DRL, and classifies the DRL methods used in existing SBEM work.
Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning
Reinforcement learning (RL) has received significant interest in recent years, due primarily to the successes of deep reinforcement learning at solving many challenging tasks such as playing Chess,
Model-Free Feedback Constrained Optimization Via Projected Primal-Dual Zeroth-Order Dynamics
In this paper, we propose a model-free feedback solution method to solve generic constrained optimization problems, without knowing the specific formulations of the objective and constraint
Real-Time Construction of Thermal Model Based on Multimodal Scene Data
In commercial buildings, the total consumption of central air conditioning accounts for about 40%–50%. However, at present, the initial design value of building Heating Ventilation and Air
Learning Efficient Dynamic Controller for HVAC System
People employ a variety of devices to achieve comfort in various aspects of their lives; for example, numerous types of air conditioners are used to maintain a pleasant indoor temperature. The wide


Energy Optimization of HVAC Systems in Commercial Buildings Considering Indoor Air Quality Management
A real-time algorithm based on the framework of Lyapunov optimization techniques is proposed to construct virtual queues related to indoor temperatures and stabilize such queues so that indoor temperatures fluctuate around the ideal time-average indoor temperature.
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings using centrally computed critics that share an attention mechanism. The attention mechanism selects relevant information for each agent at every timestep, which enables more effective and scalable learning in complex multi-agent environments than recent approaches.
Transactive Control of Commercial Buildings for Demand Response
Transactive control is a type of distributed control strategy that uses market mechanisms to engage self-interested responsive loads to achieve power balance in the electrical power grid. In this
... and M. Berges, “Gnu-RL: A precocial reinforcement learning solution for building HVAC control using a differentiable MPC policy,” in Proc. BuildSys, 2019.
... and K. Tomsovic, “Community microgrid scheduling considering building thermal dynamics,” in Proc. IEEE Power Energy Soc. Gen. Meeting, 2017.
Distributed Real-Time HVAC Control for Cost-Efficient Commercial Buildings Under Smart Grid Environment
A real-time HVAC control algorithm based on Lyapunov optimization techniques is proposed for minimizing the long-term total cost; it requires neither predictions of system parameters nor knowledge of their stochastic information.
A Data-Driven Multi-Agent Autonomous Voltage Control Framework Using Deep Reinforcement Learning
A multi-agent AVC (MA-AVC) algorithm based on the multi-agent deep deterministic policy gradient (MADDPG) method, which features centralized training and decentralized execution, is developed to solve the AVC problem.
Learning-Automata-Based Confident Information Coverage Barriers for Smart Ocean Internet of Things
A novel and widely adopted confident information coverage (CIC) model is used as the fundamental coverage model, and the CIC barrier path construction (CICBC) problem is formulated with the goals of maximizing the number of barrier paths and minimizing the number of IoT nodes in each barrier path.
Real-Time Residential Demand Response
In the proposed approach, an approximately optimal policy based on a neural network is designed to learn the optimal DR scheduling strategy; it learns directly from high-dimensional sensory data: the appliance states, the real-time electricity price, and the outdoor temperature.