Heating, Ventilation, and Air Conditioning (HVAC) systems play a critical role in ensuring occupant comfort in buildings. Traditional Rule-Based Feedback Control (RBFC) systems, while widely deployed for their simplicity, suffer from low adaptability. More recent alternatives, such as Model Predictive Control (MPC), demand complex mathematical modeling and substantial expert knowledge, creating a high barrier to system design and optimization. Reinforcement Learning (RL) emerges as a promising solution owing to its adaptability and model-free nature, though it is challenged by sample inefficiency and suboptimal convergence. Given the intrinsic delayed effects of previous actions and the prolonged thermal inertia in HVAC systems, this study introduces a deep RL framework that leverages historical observations to refine RL agent performance. By incorporating a state-of-the-art (SOTA) Transformer model, we capture the temporal patterns in HVAC data and build a more precise RL training environment. Evaluated on high-resolution, real-world HVAC datasets, our framework shows superior performance in both HVAC system modeling and RL control. Specifically, compared to two baseline models, Bidirectional Long Short-Term Memory (Bi-LSTM) and a vanilla Transformer, our proposed RL environment model achieves average prediction accuracy improvements of 30.5% and 35.8%, respectively. Moreover, our optimal past-observable RL agent delivers 35.3% electricity savings and a 54.4% thermal comfort improvement over traditional RBFC. These results demonstrate the effectiveness of integrating extensive historical observations into HVAC operation optimization.
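To make the core idea concrete, the sketch below shows one plausible way to realize a Transformer-based HVAC environment model that conditions on a window of past observations, as the abstract describes. It is a minimal illustration only: all class names, dimensions, and hyperparameters (e.g., `HVACEnvModel`, an 8-dimensional sensor state, a 48-step history) are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of a past-observation-conditioned environment model for HVAC,
# assuming a fixed history window of (state, action) pairs predicts the next
# state. Illustrative only; not the paper's actual architecture.
import torch
import torch.nn as nn


class HVACEnvModel(nn.Module):
    """Predict the next HVAC state (e.g., zone temperature, power draw) from a
    window of past observations, capturing delayed effects and thermal inertia."""

    def __init__(self, obs_dim: int, act_dim: int, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2, horizon: int = 48):
        super().__init__()
        self.embed = nn.Linear(obs_dim + act_dim, d_model)
        # Learned positional embeddings over the history window.
        self.pos = nn.Parameter(torch.zeros(1, horizon, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, obs_dim)  # next-state prediction

    def forward(self, obs_hist: torch.Tensor, act_hist: torch.Tensor) -> torch.Tensor:
        # obs_hist: (batch, horizon, obs_dim); act_hist: (batch, horizon, act_dim)
        x = self.embed(torch.cat([obs_hist, act_hist], dim=-1)) + self.pos
        h = self.encoder(x)
        # Use the final timestep's representation to predict the next state.
        return self.head(h[:, -1])


# Example usage: a 48-step history of 8 sensor readings and 2 control actions.
model = HVACEnvModel(obs_dim=8, act_dim=2)
next_state = model(torch.randn(16, 48, 8), torch.randn(16, 48, 2))
print(next_state.shape)  # torch.Size([16, 8])
```

Trained on logged HVAC trajectories, such a model could then serve as the RL training environment, with the agent's own state likewise augmented by the same history window.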