Abstract

We present a novel partially observable Markov decision process (POMDP) modelling method for the reactive power optimization of an active distribution network (ADN) with high penetration of distributed generation. The model is tolerant of the uncertainty introduced by imperfect measurement data. We treat the belief state space of the POMDP as the counterpart of the state space in a Markov decision process (MDP), which allows us to apply the multi-agent actor-attention-critic (MAAC) reinforcement learning (RL) algorithm to the proposed model. This technique extracts the most informative, highest-quality features from the large historical measurement database, thereby improving the agents' learning efficiency and the stability of the optimization strategy. We simulate reactive power optimization on a modified IEEE 33-node ADN and a modified IEEE 123-node ADN. The simulations demonstrate that, compared with previous RL algorithms based on the MDP model, the proposed approach is more stable and more economical under varying degrees of data uncertainty. They also show that the proposed POMDP model is better suited to the real operation of a partially observable distribution network than MDP models, and that the optimal strategy obtained by the MAAC algorithm remains reliable as data quality deteriorates.
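For reference, the belief state underlying this POMDP-to-MDP correspondence follows the standard Bayesian belief update (standard POMDP notation, not drawn from this paper): given a transition model $T(s' \mid s, a)$, an observation model $O(o \mid s', a)$, and a prior belief $b$ over states, the posterior belief after taking action $a$ and receiving observation $o$ is

$$ b'(s') = \eta \, O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a) \, b(s), $$

where $\eta$ is a normalizing constant. Because $b$ is a sufficient statistic for the observation history, planning over beliefs recovers an MDP whose states are the beliefs themselves, which is what permits applying an MDP-based RL algorithm such as MAAC.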
