Abstract
The use of Deep Reinforcement Learning (DRL) in building energy management is often hampered by data-efficiency and computational challenges. Long training times and unstable, potentially harmful control performance limit DRL’s adaptability and practicality in building control applications. To address these issues, this study introduces Generative Adversarial Imitation Learning (GAIL), a method that effectively utilizes expert knowledge and demonstrations. Expert demonstrations range from fine-tuned rule-based controls to strategies inspired by optimization algorithms. By combining a generative adversarial network with imitation learning, GAIL learns a control strategy from expert demonstrations through an adversarial training process. We conducted a comprehensive evaluation comparing GAIL’s performance with that of the DRL algorithm Proximal Policy Optimization (PPO) in controlling a variable air volume system for load shifting in commercial buildings. GAIL, guided by expert demonstrations derived from model predictive control, achieved significantly better computational efficiency and effectiveness. In terms of unified cumulative reward, GAIL with data augmentation reached 95% of expert performance (22% higher than the rule-based baseline) within 100 training epochs; GAIL also outperformed PPO by 7%, yielding 2% lower energy costs and notably improved thermal comfort, evidenced by a reduction of 18.65 unmet degree-hours over the one-week operation. In comparison, PPO required more training time and still lagged behind GAIL in cumulative reward even after 500 epochs. These findings highlight GAIL’s ability to learn faster from fewer training samples, enabling cost-effective solutions with lower computational requirements.
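The adversarial training process mentioned above alternates between a discriminator that separates expert behavior from policy behavior and a policy that is rewarded for fooling the discriminator. The following is a minimal sketch of that loop on a toy two-state, two-action problem; the environment, one-hot feature encoding, learning rates, and tabular policy are illustrative assumptions, not the paper's actual building-control setup:

```python
import math
import random

random.seed(0)

# Toy setup: two states, two actions. The "expert" always picks a == s.
STATES = [0, 1]
expert_demos = [(s, s) for s in STATES for _ in range(50)]

def feat(s, a):
    # One-hot feature over the four (state, action) pairs.
    f = [0.0] * 4
    f[2 * s + a] = 1.0
    return f

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w = [0.0] * 4                              # discriminator weights
policy_logits = [[0.0, 0.0], [0.0, 0.0]]   # per-state action logits

def d(s, a):
    # Discriminator's probability that (s, a) came from the expert.
    return sigmoid(sum(wi * fi for wi, fi in zip(w, feat(s, a))))

def pi(s):
    # Softmax policy over actions for state s.
    m = max(policy_logits[s])
    e = [math.exp(l - m) for l in policy_logits[s]]
    z = sum(e)
    return [x / z for x in e]

def sample_action(s):
    return 0 if random.random() < pi(s)[0] else 1

lr_d, lr_pi = 0.1, 0.1
for epoch in range(300):
    # 1) Roll out the current policy to collect (state, action) pairs.
    policy_batch = [(s, sample_action(s)) for s in STATES for _ in range(50)]
    # 2) Discriminator step: push D(expert) toward 1, D(policy) toward 0
    #    (gradient ascent on the logistic log-likelihood, scaled by batch size).
    labeled = [(p, 1.0) for p in expert_demos] + [(p, 0.0) for p in policy_batch]
    for (s, a), label in labeled:
        err = label - d(s, a)
        f = feat(s, a)
        for i in range(4):
            w[i] += lr_d * err * f[i] / 100
    # 3) Policy step: REINFORCE on the surrogate reward -log(1 - D(s, a)),
    #    which is high where the discriminator believes the expert acted.
    for s, a in policy_batch:
        r = -math.log(max(1e-8, 1.0 - d(s, a)))
        p = pi(s)
        for b in range(2):
            grad = (1.0 if b == a else 0.0) - p[b]
            policy_logits[s][b] += lr_pi * r * grad / 50

for s in STATES:
    print("state", s, "action probabilities", pi(s))
```

After training, the policy concentrates probability on the expert's action in each state, without ever seeing an explicit reward signal: the only supervision is the discriminator's judgment of how expert-like each state-action pair looks.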
Overall, GAIL presents a promising approach to building energy management and provides a practical and flexible solution to the shortcomings of learning-based controllers that require extensive computational resources and training time.