Safe Off-Policy Deep Reinforcement Learning Algorithm for Volt-VAR Control in Power Distribution Systems

Wei Wang, Yuanqi Gao, Jie Shi, Nanpeng Yu

doi:10.1109/tsg.2019.2962625

Open Access

Journal: IEEE Transactions on Smart Grid
Publication Date: Jul 1, 2020
Citations: 224
License type: publisher-specific, author manuscript

Affiliation: University of California, Riverside

Abstract

Volt-VAR control is critical to keeping distribution network voltages within the allowable range, minimizing losses, and reducing wear and tear on voltage regulating devices. To deal with incomplete and inaccurate distribution network models, we propose a safe off-policy deep reinforcement learning algorithm that solves Volt-VAR control problems in a model-free manner. The Volt-VAR control problem is formulated as a constrained Markov decision process with a discrete action space and solved by our proposed constrained soft actor-critic algorithm. The algorithm achieves scalability, sample efficiency, and constraint satisfaction by combining the merits of the maximum-entropy framework, the method of multipliers, a device-decoupled neural network structure, and an ordinal encoding scheme. Comprehensive numerical studies with the IEEE distribution test feeders show that our proposed algorithm outperforms existing reinforcement learning algorithms and conventional optimization-based approaches on a large feeder.
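
Two of the ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact ordinal encoding or the multiplier update rule, so the sketch assumes a common "thermometer" form of ordinal encoding for discrete device settings (for example, regulator tap positions) and a standard projected dual-ascent step for a Lagrange multiplier. The function names, the step size, and the cost limit are all hypothetical.

```python
import numpy as np


def ordinal_encode(level: int, num_levels: int) -> np.ndarray:
    """Thermometer-style ordinal code (an assumed form, not taken from the paper).

    A setting `level` in {0, ..., num_levels - 1} maps to a vector of length
    num_levels - 1 whose first `level` entries are 1 and the rest 0, so
    neighbouring settings differ in exactly one entry.
    """
    if not 0 <= level < num_levels:
        raise ValueError("level must lie in [0, num_levels)")
    return (np.arange(num_levels - 1) < level).astype(np.float32)


def update_multiplier(lam: float, avg_cost: float, cost_limit: float,
                      step_size: float = 0.01) -> float:
    """One projected dual-ascent step on a Lagrange multiplier (assumed rule).

    The multiplier grows when the average constraint cost exceeds its limit
    and shrinks otherwise; it is clipped at zero so it stays non-negative.
    """
    return max(0.0, lam + step_size * (avg_cost - cost_limit))


if __name__ == "__main__":
    # Hypothetical device with 5 discrete settings, currently at setting 2.
    print(ordinal_encode(2, 5))
    # Hypothetical constraint cost above its limit: the multiplier increases.
    print(update_multiplier(0.0, avg_cost=1.0, cost_limit=0.5, step_size=0.1))
```

The thermometer code preserves the ordering of discrete settings, which a one-hot code would discard; the non-negative multiplier weights the constraint cost against the reward in a Lagrangian-style objective.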

Full Text