Dynamic Markov Decision Process Research Articles

This paper proposes a novel optimal control method for multi-zone HVAC systems to enhance energy efficiency and improve occupant comfort. To address disturbances in outdoor weather and indoor loads, the proposed method formulates the HVAC control problem as a dynamic Markov decision process and employs deep reinforcement learning (DRL) techniques to obtain the optimal control policy. To balance the competing goals of thermal comfort, indoor air quality (IAQ), and system energy efficiency, a preference-inspired (P-ins) mechanism is designed. The P-ins mechanism effectively guides the agent towards the optimal control policy with a high convergence rate. The proposed method has been validated on an EnergyPlus-Python co-simulation testbed with real-world data traces and assessed with an overall evaluation indicator. Results demonstrate that the proposed method can reduce energy consumption without compromising thermal comfort and IAQ. Specifically, the occurrence of temperature violations is reduced to below 0.8%, and a maximum energy saving of 9.41% is achieved compared with traditional methods.
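As a rough illustration of how several objectives can be folded into a single reinforcement-learning reward, the sketch below scalarizes comfort, IAQ, and energy penalties with fixed preference weights. This is a plain weighted sum and not the paper's P-ins mechanism, whose details the abstract does not give. The names `ZoneReading`, `comfort_penalty`, `iaq_penalty`, and `weighted_reward`, the 21 to 25 °C comfort band, and the 1000 ppm CO2 limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class ZoneReading:
    """Hypothetical per-zone observation; field names are illustrative."""
    temp_c: float    # zone air temperature, deg C
    co2_ppm: float   # zone CO2 concentration, ppm
    power_kw: float  # HVAC power attributed to the zone, kW


def comfort_penalty(temp_c: float, low: float = 21.0, high: float = 25.0) -> float:
    """Degrees outside an assumed comfort band (0 inside the band)."""
    if temp_c < low:
        return low - temp_c
    if temp_c > high:
        return temp_c - high
    return 0.0


def iaq_penalty(co2_ppm: float, limit: float = 1000.0) -> float:
    """Fractional excess over an assumed CO2 limit (0 at or below it)."""
    return max(0.0, co2_ppm - limit) / limit


def weighted_reward(
    zones: Sequence[ZoneReading],
    weights: Tuple[float, float, float],
) -> float:
    """Negative weighted sum of comfort, IAQ, and energy penalties."""
    w_comfort, w_iaq, w_energy = weights
    comfort = sum(comfort_penalty(z.temp_c) for z in zones)
    iaq = sum(iaq_penalty(z.co2_ppm) for z in zones)
    energy = sum(z.power_kw for z in zones)
    return -(w_comfort * comfort + w_iaq * iaq + w_energy * energy)
```

Raising one weight relative to the others makes violations of that objective cost more, which is the basic lever any preference-based scheme has to tune.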

The bang-bang relays used in multiple-boiler system (MBS) control are characterized by complex limiter saturation functions and operate with fixed parameters. Their action signals cannot precisely track the nonlinear, dynamic heating demand of a building over the entire operating range. Moreover, in a mono-boiler system, the bang-bang controller short-cycles increasingly under partial load, because the boiler is effectively oversized for most of its running time; this raises energy consumption and causes indoor thermal comfort to fluctuate. It is therefore difficult for such controllers to cope with uncertainties in outdoor conditions and indoor heating load. This study formulates the MBS control problem as a dynamic Markov decision process and applies a deep clustering of reinforcement learning approach to obtain the optimal control policy through interaction with the environment, using multi-agent learning with bang-bang actions. The result is a new boiler sequencing control (BSC) strategy, deep clustering of reinforcement learning based on bang-bang (DCRLBB). The deep clustering breaks Lagrangian trajectory curves into piecewise segments that represent the RL agent's action policy. The agent's action-policy signals are derived from a bang-bang reward formula built on trade-off considerations, which makes them more adjustable than traditional fixed-parameter schemes such as the fuzzy bang-bang controller (FBBC). The BSC agent significantly affects the energy performance of the MBS, while a second agent resizes boiler capacity by adjusting the boiler's solenoid fuel valve. Compared with the conventional FBBC, DCRLBB responds better under actual dynamic indoor and outdoor conditions and saves more than 32% of energy while keeping indoor thermal conditions within the comfortable range.
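The short-cycling problem that motivates this work can be seen in a generic bang-bang relay with a deadband. The sketch below is a textbook hysteresis relay, not the DCRLBB strategy or the FBBC baseline. The function names `bang_bang` and `count_switches` and the sample temperature trace are hypothetical, chosen only to show how a narrow deadband leads to more on/off switching.

```python
from typing import Iterable


def bang_bang(temp_c: float, setpoint_c: float, deadband_c: float, is_on: bool) -> bool:
    """Return the new boiler on/off state for a relay with hysteresis."""
    if temp_c <= setpoint_c - deadband_c / 2:
        return True   # too cold: fire the boiler
    if temp_c >= setpoint_c + deadband_c / 2:
        return False  # too warm: shut the boiler off
    return is_on      # inside the deadband: hold the previous state


def count_switches(temps: Iterable[float], setpoint_c: float, deadband_c: float) -> int:
    """Count on/off transitions over a temperature trace (boiler starts off)."""
    is_on = False
    switches = 0
    for temp_c in temps:
        new_state = bang_bang(temp_c, setpoint_c, deadband_c, is_on)
        if new_state != is_on:
            switches += 1
        is_on = new_state
    return switches


# Illustrative trace oscillating around a 20 deg C setpoint.
trace = [19.0, 20.4, 19.6, 20.4, 19.6, 21.0]
narrow = count_switches(trace, setpoint_c=20.0, deadband_c=0.5)
wide = count_switches(trace, setpoint_c=20.0, deadband_c=2.0)
```

On this trace the narrow deadband switches the boiler on every sample, while the wide one switches only at the extremes, at the cost of larger temperature swings. A fixed deadband cannot resolve that trade-off as load changes, which is the limitation a learned policy aims to address.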

Articles published on Dynamic Markov Decision Process

Energy and comfort aware operation of multi-zone HVAC system through preference-inspired deep reinforcement learning

Deep clustering of reinforcement learning based on the bang-bang principle to optimize the energy in multi-boiler for intelligent buildings

Stochastic resource scheduling via bilayer dynamic Markov decision process in mobile cloud networks
