In the context of the global response to climate change and the promotion of the energy transition, the Internet of Things (IoT), sensor technologies, and big data analytics have been increasingly applied in power systems, contributing to the rapid development of distributed energy resources. The integration of large numbers of distributed energy resources has introduced challenges such as increased volatility and uncertainty in distribution networks, large-scale data, and greater complexity in optimizing secure and economic dispatch strategies. To address these problems, this paper proposes a day-ahead scheduling method for distribution networks based on an improved multi-agent proximal policy optimization (MAPPO) reinforcement learning algorithm. The method achieves coordinated scheduling of multiple types of distributed resources within the distribution network environment, promoting effective interaction between the distributed resources and the grid as well as coordination among the resources themselves. Firstly, the operational framework and principles of the proposed algorithm are described. To avoid blind trial-and-error and instability in convergence during learning, a generalized advantage estimation (GAE) function is introduced to improve the multi-agent proximal policy optimization algorithm, enhancing the stability of policy updates and the speed of convergence during training. Secondly, a day-ahead scheduling model for a distribution network containing multiple types of distributed resources is constructed, and based on this model, the environment, actions, states, and reward function are designed. Finally, the effectiveness of the proposed method in solving the day-ahead economic dispatch problem for distribution networks is verified on a modified IEEE 30-bus test system.
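As a minimal sketch of the GAE component mentioned above: generalized advantage estimation replaces the raw return with an exponentially weighted sum of temporal-difference residuals, which trades a small amount of bias for a large reduction in the variance of the policy gradient. The helper below is a generic illustration of the standard GAE recursion (Schulman et al.), not the paper's implementation; the function name and the episode data are hypothetical.

```python
import numpy as np

def generalized_advantage_estimation(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one finished episode.

    rewards: list of per-step rewards r_0..r_{T-1}
    values:  list of critic estimates V(s_0)..V(s_T)
             (one extra bootstrap value for the final state)
    gamma:   discount factor; lam: GAE smoothing parameter
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    # Work backwards so each step reuses the accumulated future residuals
    for t in reversed(range(T)):
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # A_t = delta_t + (gamma * lam) * A_{t+1}
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages

# Toy episode: three unit rewards with a zero value baseline
adv = generalized_advantage_estimation([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
```

In a MAPPO setting, each agent's clipped surrogate objective would use these advantages in place of the Monte Carlo return, which is what stabilizes policy updates during training.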