Abstract

We investigate the application of a Deep Reinforcement Learning (DRL) method to demand-responsive, closed-loop scheduling of continuous process/energy systems. The method employed is the Soft Actor-Critic (SAC), an off-policy, stochastic actor-critic method with built-in entropy maximization that balances exploration and exploitation. Since energy systems are typically characterized by hybrid (combined discrete-continuous) actions originating from equipment operating ranges and discrete actuators, we demonstrate the main ways in which hybrid actions can be incorporated into the SAC framework. A unified treatment is presented in which five approaches for modeling hybrid actions are compared: two with deterministic discrete decisions (DSReL and Softmax) and three with stochastic discrete decisions (Q-enumeration, Gumbel-Softmax reparameterization, and the Score Function gradient estimator). DSReL and Q-enumeration are shown to have the best overall performance on the considered environment. The developed hybrid-SAC method is then applied to the operation of process/energy systems under day-ahead electricity prices and demand forecasts. A case study of a large-scale District Cooling plant employing real demand and price data is presented. It is shown that the algorithm quickly avoids constraint violations and continuously improves toward the optimal solution. Lastly, an analysis of demand-forecast uncertainty shows that the hybrid-SAC algorithm handles state uncertainty robustly and works well for partially observable systems with incomplete state information.
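
As a rough illustration of one of the stochastic discrete-decision variants named above, the sketch below shows how a SAC actor might emit a hybrid action by pairing a tanh-squashed Gaussian continuous head with a Gumbel-Softmax discrete head. This is an assumption-laden sketch, not the paper's implementation: the class and parameter names (HybridActor, n_discrete, tau), the network sizes, and the use of PyTorch are invented for illustration.

```python
# Minimal sketch (assumed, not the authors' code) of a hybrid SAC actor that
# combines a continuous action with a Gumbel-Softmax discrete action.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridActor(nn.Module):
    def __init__(self, state_dim: int, cont_dim: int, n_discrete: int, hidden: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, cont_dim)        # mean of continuous action
        self.log_std = nn.Linear(hidden, cont_dim)   # log-std of continuous action
        self.logits = nn.Linear(hidden, n_discrete)  # logits of discrete action

    def forward(self, state: torch.Tensor, tau: float = 1.0):
        h = self.body(state)

        # Continuous part: reparameterized, tanh-squashed Gaussian (standard SAC).
        mu, log_std = self.mu(h), self.log_std(h).clamp(-20, 2)
        std = log_std.exp()
        u = mu + std * torch.randn_like(std)          # reparameterization trick
        a_cont = torch.tanh(u)

        # Discrete part: differentiable one-hot sample via Gumbel-Softmax, so
        # critic gradients can flow back into the discrete logits.
        logits = self.logits(h)
        a_disc = F.gumbel_softmax(logits, tau=tau, hard=True)

        # Log-probabilities of both parts, needed for the SAC entropy term.
        normal = torch.distributions.Normal(mu, std)
        logp_cont = (normal.log_prob(u) - torch.log(1 - a_cont.pow(2) + 1e-6)).sum(-1)
        logp_disc = (F.log_softmax(logits, dim=-1) * a_disc).sum(-1)

        return a_cont, a_disc, logp_cont + logp_disc
```

In such a setup, a call like actor(state_batch) would return the continuous action, a one-hot discrete action, and the joint log-probability entering the SAC entropy bonus; the other variants compared in the paper (e.g. Q-enumeration or the Score Function estimator) would replace the Gumbel-Softmax head with a different treatment of the discrete decision.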
