This paper proposes a bi-level optimal control strategy for thermally coupled multi-zone HVAC systems assisted by a dedicated outside air system (DOAS). The strategy integrates a deep reinforcement learning (DRL) algorithm with Lyapunov function-based reward shaping (LRS) and an augmented genetic algorithm, and aims to maintain indoor air quality (IAQ) and thermal comfort while reducing the overall energy cost. On the upper level, the LRS method enhances the convergence of the DRL algorithm, ensuring satisfactory IAQ with reduced electricity cost. On the lower level, embedding the pre-trained optimal agent achieves a unidirectional decoupling, which mitigates model dependency and enables accurate control of thermal comfort under thermal coupling. The proposed strategy offers the following advantages: a) It efficiently maintains IAQ in multiple zones with high energy efficiency by optimizing the supply airflow in response to real-time environmental changes. b) It achieves satisfactory thermal comfort by regulating the fan coil unit power, while implicitly reducing the overall energy cost via unidirectional decoupling. c) Experimental results demonstrate that it achieves maximum energy savings of 33.13% and electricity savings of 34.14% compared with the traditional strategy. d) It exhibits good generalization ability under different test climates and occupancy scenarios.