The large-scale integration of distributed energy resources poses challenges to the safe and stable operation of the power grid. The virtual power plant (VPP) is a key technology linking user-side energy resources with the distribution network (DN). This study focuses on the online coordinated scheduling of VPPs and the DN and on the real-time response strategy of distributed equipment (DE) within each VPP. To this end, a hierarchical deep reinforcement learning (DRL) algorithm, Hierarchical-TD3, is designed based on unified modeling of the adjustable space to achieve real-time economic scheduling of VPPs. In the upper layer, the DN accounts for network security constraints and solves the economic scheduling model of the VPPs with the single-agent TD3 algorithm. In the lower layer, the VPPs follow the scheduling instructions from the upper layer while respecting privacy protection and control autonomy requirements, and realize the real-time response of the DE within each VPP via the multi-agent MATD3 algorithm. Numerical results on a modified 33-node system show that the proposed Hierarchical-TD3 algorithm achieves privacy protection and coordinated scheduling of VPPs and DE while reducing operating cost. The resulting cost differs from the optimal value by only 1.46%, yet decisions are made online on the millisecond scale. Compared with traditional centralized and decentralized DRL algorithms, the total cost is reduced by 10.15% and 5.52%, respectively. Unlike the traditional soft-constraint method, the proposed algorithm incurs no constraint violations during the training and testing phases. Finally, tests on an actual 116-node system validate the scalability of the proposed Hierarchical-TD3 algorithm.
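Both layers described above build on TD3. As a rough illustration only, the following sketch shows the standard TD3 critic target (target policy smoothing plus the clipped double-Q minimum) in NumPy. It is not the authors' implementation: the function name `td3_target`, the callables `actor_target`, `critic1_target`, and `critic2_target`, and all default hyperparameters are assumptions introduced for this example, and the abstract's adjustable-space modeling and network security constraints are not represented.

```python
import numpy as np


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=None):
    """Generic TD3 critic target; all names and defaults are illustrative."""
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: add clipped Gaussian noise to the target action.
    action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=action.shape),
                    -noise_clip, noise_clip)
    smoothed = np.clip(action + noise, action_low, action_high)

    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = np.minimum(critic1_target(next_state, smoothed),
                       critic2_target(next_state, smoothed))

    # Bootstrapped target; terminal transitions (done == 1) keep only the reward.
    return reward + gamma * (1.0 - done) * q_min
```

In a hierarchical setup like the one summarized above, one might expect the upper-layer agent to use such a target with DN-level states and the lower-layer agents to use a multi-agent variant, but the abstract does not give those details.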