Multi-objective solution of optimal power flow based on TD3 deep reinforcement learning algorithm

Bowei Sun, Minggang Song, Ang Li, Nan Zou, Pengfei Pan, Xi Lu, Qun Yang, Hengrui Zhang, Xiangyu Kong

doi:10.1016/j.segan.2023.101054

Abstract

With the growing share of renewable energy generating units in the power grid, coordinating and controlling conventional and renewable generation within the optimal power flow (OPF) problem is attracting increasing attention. In this paper, an OPF problem that includes thermal power units and renewable energy units is formulated under line constraints and generator constraints, and multiple solution objectives are set: line safety, renewable energy output, unit generation capability, and unit operating costs. TD3 is an off-policy reinforcement learning method whose learning strategy combines dual critic networks, delayed actor network updates, replay buffer sampling, and target policy smoothing regularization, and these features allow it to outperform other reinforcement learning methods. We therefore adopt TD3 to solve this multi-objective problem and verify its effectiveness through a case study. The experiments, comprising an algorithm comparison and a line disconnection analysis, show that TD3 achieves better results: the algorithm comparison shows that TD3 converges faster, while the disconnection analysis demonstrates the robustness of the proposed method. The method yields a reasonable generator output schedule. In addition, different weight coefficients lead to different control strategies.
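As a rough illustration of two of the TD3 ingredients named in the abstract (dual critic networks and target policy smoothing regularization), the sketch below computes the bootstrapped critic target in NumPy. It is not the paper's implementation: the function name `td3_target`, the callables `target_actor`, `target_critic_1`, `target_critic_2`, and all default hyperparameters (`gamma`, `noise_std`, `noise_clip`, action bounds) are assumptions made for this example only.

```python
import numpy as np

def td3_target(reward, next_state, done,
               target_actor, target_critic_1, target_critic_2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=None):
    """Critic target for a batch of transitions sampled from a replay buffer.

    All names and default values here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    next_action = target_actor(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, action_low, action_high)

    # Dual critics: take the smaller of the two target Q-values
    # to limit overestimation.
    q1 = target_critic_1(next_state, next_action)
    q2 = target_critic_2(next_state, next_action)
    q_min = np.minimum(q1, q2)

    # One-step bootstrapped target; terminal transitions (done = 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min
```

In a full training loop, both critics would be regressed toward this target at every step, while the actor and the target networks would be updated only every few critic updates (the delayed actor update mentioned above).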

Full Text