Hierarchical Hybrid Multi-Agent Deep Reinforcement Learning for Peer-to-Peer Energy Trading Among Multiple Heterogeneous Microgrids

Yuxin Wu, Tianyang Zhao, Nian Liu, Min Liu, Haoyuan Yan

doi:10.1109/tsg.2023.3250321

Abstract

Peer-to-peer (P2P) energy trading among multiple microgrids (MGs) has emerged as a promising paradigm for more efficient supply-demand balancing within local areas. However, existing works still exhibit limitations in trading architecture and pricing schemes. In addition, existing multi-agent deep reinforcement learning (MADRL) methods suffer from computational overload caused by exploring a joint, hybrid action space during centralized training. In this paper, we propose a P2P energy trading paradigm based on hierarchical hybrid MADRL to maximize the trading profits of multiple heterogeneous MGs. First, we design a novel hierarchical structure for the MG agent to model the coupled interaction between flexible demand scheduling and autonomous quotation. Then, a P2P market that employs an improved mid-market rate (IMMR) pricing scheme is proposed to incentivize participation in local trading. Furthermore, to handle the hybrid discrete-continuous action space and reduce computational complexity, we propose a hierarchical hybrid multi-agent double deep Q-network and deep deterministic policy gradient (hh-MADDQN-DDPG) algorithm that splits the optimal policy learning workload into a sequence of two sub-tasks: DDQN for flexible demand scheduling and DDPG for energy trading. Numerical results of simulation I demonstrate that our hh-MADDQN-DDPG with IMMR increases trading profits by 25% on average relative to the baselines. Results of simulation II show that our hh-MADDQN-DDPG yields higher profits than existing methods while maintaining better computational performance and scalability.
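For readers unfamiliar with mid-market rate pricing, the following is a minimal sketch of the conventional mid-market rate rule as it is commonly described in the P2P trading literature. It is not the IMMR scheme proposed in this paper, whose details are not given in the abstract. The function name, argument names, and example prices are hypothetical and chosen only for illustration.

```python
def mid_market_rate_prices(supply_kwh, demand_kwh,
                           grid_import_price, grid_export_price):
    """Return (buy_price, sell_price) for local trading in one time slot.

    Conventional mid-market rate sketch (an assumption, not the paper's IMMR):
    the mid price is the average of the grid import and export prices; the
    side with a surplus or deficit absorbs the cost of trading the residual
    energy with the main grid.
    """
    mid = (grid_import_price + grid_export_price) / 2.0

    if supply_kwh == demand_kwh:
        # Local supply exactly covers local demand: everyone trades at mid.
        return mid, mid

    if supply_kwh > demand_kwh:
        # Surplus: buyers pay mid, and sellers export the excess to the grid
        # at the lower export price, which dilutes their average sell price.
        surplus = supply_kwh - demand_kwh
        sell = (demand_kwh * mid + surplus * grid_export_price) / supply_kwh
        return mid, sell

    # Deficit: sellers receive mid, and buyers import the shortfall from the
    # grid at the higher import price, which raises their average buy price.
    deficit = demand_kwh - supply_kwh
    buy = (supply_kwh * mid + deficit * grid_import_price) / demand_kwh
    return buy, mid


# Hypothetical example: 20 kWh offered locally against 10 kWh of demand.
buy_price, sell_price = mid_market_rate_prices(20.0, 10.0, 0.20, 0.10)
print(f"buy={buy_price:.3f}, sell={sell_price:.3f}")
```

Under this rule both local prices stay between the grid export and import prices, which is why mid-market pricing can make local trading attractive to buyers and sellers alike.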

Full Text