Preventive Control Policy Construction in Active Distribution Network of Cyber-Physical System with Reinforcement Learning

Pengpeng Sun, Sen Yuan, Chong Wang, Yunwei Dong

doi:10.3390/app11010229

Open Access

https://doi.org/10.3390/app11010229

Journal: Applied Sciences	Publication Date: Dec 29, 2020
Citations: 2	License type: CC BY 4.0

Affiliation: Northwestern Polytechnical University

Abstract

Once an active distribution network of a cyber-physical system is in an alert state, it is vulnerable to cross-domain cascading failures. A preventive control policy is therefore needed to move the network from the alert state back to a normal state and guard against cross-domain cascading failures. Constructing and analyzing such a policy is difficult with either theoretical analysis or physical experiments: theoretical analysis may be inaccurate because it relies on approximated models, and physical experiments are expensive and time-consuming because prototypes must be built. This paper presents a preventive control policy construction method based on the deep deterministic policy gradient (abbreviated as PCMD), which generates and optimizes a preventive control policy using artificial intelligence (AI) techniques. It uses reinforcement learning to exploit the available historical data, addressing the problems of high cost and low accuracy. First, a preventive control model is designed based on finite automaton theory; the model guides data collection and the selection of the learning policy. The control model uses voltage stability, frequency stability, current overload prevention, and control cost reduction as feedback variables, without requiring explicit power flow equations or differential equations. Then, after sufficient training, a locally optimal preventive control policy is constructed under the comparability condition between a fitted action-value function and a fitted policy function. The constructed policy consists of control actions that achieve a low cost and follow the principle of shortening the propagation sequence of cross-domain cascading failures as far as possible. The PCMD is more flexible and closer to reality than theoretical analysis methods, and it costs less than physical experimental methods.
To evaluate the proposed method, an experimental case study is carried out on the China Electric Power Research-Cyber-Physical System (abbreviated as CEPR-CPS), which originates from the China Electric Power Research Institute. The results show that the PCMD constructs preventive control policies more effectively than most current methods, such as the multi-agent method, in terms of reducing the number of failed nodes and avoiding state space explosion.
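
The feedback signal described in the abstract combines voltage stability, frequency stability, current overload prevention, and control cost. A minimal sketch of how such a scalar reward could be assembled is shown below. It is not the paper's formula: the weights, the per-unit voltage reference, the 50 Hz nominal frequency, and all names (`GridObservation`, `preventive_reward`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GridObservation:
    """Hypothetical snapshot of the quantities the abstract lists as feedback."""
    voltage_pu: float     # bus voltage in per-unit (1.0 = nominal), assumed
    frequency_hz: float   # system frequency in Hz
    line_loading: float   # line current divided by rated current
    control_cost: float   # cost of the control action just applied


def preventive_reward(obs, w_v=1.0, w_f=1.0, w_i=1.0, w_c=0.1, nominal_hz=50.0):
    """Return a scalar reward: zero at nominal conditions, negative otherwise.

    All weights and the nominal frequency are illustrative assumptions.
    """
    voltage_penalty = abs(obs.voltage_pu - 1.0)
    frequency_penalty = abs(obs.frequency_hz - nominal_hz)
    # Only loading above the rated current counts as an overload.
    overload_penalty = max(0.0, obs.line_loading - 1.0)
    return -(w_v * voltage_penalty
             + w_f * frequency_penalty
             + w_i * overload_penalty
             + w_c * obs.control_cost)


nominal = GridObservation(voltage_pu=1.0, frequency_hz=50.0,
                          line_loading=0.8, control_cost=0.0)
stressed = GridObservation(voltage_pu=0.9, frequency_hz=49.5,
                           line_loading=1.2, control_cost=2.0)
print(preventive_reward(nominal), preventive_reward(stressed))
```

Because the reward is computed from observed quantities only, a learner using it needs no explicit power flow or differential equations, which matches the model-free framing of the abstract.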

Highlights

Once an active distribution network of a cyber-physical system is in an alert state, it is vulnerable to cross-domain cascading failures
Comparison diagrams of cascading failures (CCF) initiated from a node failure in the power network (PN) and in the communication network (CN) are shown in Figures 5 and 6, respectively
A preventive control model based on finite automaton theory is designed as a six-tuple that describes preventive actions for blocking the propagation of cross-domain cascading failures in an active distribution network of a cyber-physical system; the resulting cross-domain cascading failure sequences should be as short as possible
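
The last highlight describes the control model as a six-tuple automaton, but this page does not list the six components. The sketch below shows one plausible shape for such a structure. Every field name, state label, and action label is a hypothetical placeholder, not the paper's definition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class PreventiveControlModel:
    """Hypothetical six-tuple; the paper's actual components may differ."""
    states: FrozenSet[str]
    actions: FrozenSet[str]
    transition: Callable[[str, str], str]
    reward: Callable[[str, str, str], float]
    initial_state: str
    target_states: FrozenSet[str]


def toy_transition(state, action):
    # Assumed toy dynamics: shedding load in the alert state restores normal.
    if state == "alert" and action == "shed_load":
        return "normal"
    return state


def toy_reward(state, action, next_state):
    cost = 0.0 if action == "wait" else 1.0
    bonus = 10.0 if next_state == "normal" and state != "normal" else 0.0
    return bonus - cost


def run(model, policy, max_steps=10):
    """Apply a policy until a target state is reached; return trace and return."""
    state, total = model.initial_state, 0.0
    trace = [state]
    for _ in range(max_steps):
        if state in model.target_states:
            break
        action = policy(state)
        next_state = model.transition(state, action)
        total += model.reward(state, action, next_state)
        state = next_state
        trace.append(state)
    return trace, total


model = PreventiveControlModel(
    states=frozenset({"normal", "alert"}),
    actions=frozenset({"wait", "shed_load"}),
    transition=toy_transition,
    reward=toy_reward,
    initial_state="alert",
    target_states=frozenset({"normal"}),
)
trace, total = run(model, lambda state: "shed_load")
print(trace, total)
```

In this toy setting, a shorter trace corresponds to a shorter failure propagation sequence, which is the quantity the highlight says should be minimized.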