Abstract

Peer-to-peer (P2P) energy trading among multiple microgrids (MGs) has emerged as a promising paradigm for more efficient supply-demand balancing within local areas. However, existing works still exhibit limitations in trading architecture and pricing schemes. In addition, existing multi-agent deep reinforcement learning (MADRL) methods suffer from computational overload caused by exploring a joint, hybrid action space during centralized training. In this paper, we propose a P2P energy trading paradigm based on hierarchical hybrid MADRL to maximize the trading profits of multiple heterogeneous MGs. First, we design a novel hierarchical structure for the MG agent to model the coupled interaction between flexible demand scheduling and autonomous quotation. Then, we propose a P2P market that employs an improved mid-market rate (IMMR) pricing scheme to incentivize participation in local trading. Furthermore, to handle the hybrid discrete-continuous action space and reduce computational complexity, we propose a hierarchical hybrid multi-agent double deep Q-network and deep deterministic policy gradient (hh-MADDQN-DDPG) algorithm that splits the optimal-policy learning workload into a sequence of two sub-tasks: DDQN for flexible demand scheduling and DDPG for energy trading. Numerical results of Simulation I demonstrate that hh-MADDQN-DDPG with IMMR increases trading profits by 25% on average relative to the baselines. Results of Simulation II show that hh-MADDQN-DDPG yields higher profits than existing methods while maintaining better computational performance and scalability.
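
As background for the pricing component, the following is a minimal sketch of the conventional mid-market rate (MMR) rule as it is commonly described in the P2P energy trading literature. It is not the improved scheme (IMMR) proposed here, whose details the abstract does not state. The function name, parameter names, and the example prices are hypothetical and chosen only for illustration.

```python
def mid_market_rate(total_supply, total_demand, retail_price, feed_in_price):
    """Return (buy_price, sell_price) for the local market under conventional MMR.

    Assumption: this is the textbook MMR rule, not the paper's IMMR.
    total_supply / total_demand: aggregate local surplus and deficit (kWh).
    retail_price / feed_in_price: utility grid tariffs (currency per kWh).
    """
    mid = (retail_price + feed_in_price) / 2.0

    if total_supply <= 0 or total_demand <= 0:
        # No local counterparty: all energy is settled with the utility grid.
        return retail_price, feed_in_price

    if total_supply == total_demand:
        # Balanced market: both sides trade at the midpoint price.
        return mid, mid

    if total_supply > total_demand:
        # Excess supply is exported at the feed-in tariff, so sellers
        # receive a blend of the midpoint price and the feed-in tariff.
        excess = total_supply - total_demand
        sell = (mid * total_demand + feed_in_price * excess) / total_supply
        return mid, sell

    # Excess demand is imported at the retail price, so buyers pay a
    # blend of the midpoint price and the retail price.
    shortfall = total_demand - total_supply
    buy = (mid * total_supply + retail_price * shortfall) / total_demand
    return buy, mid


if __name__ == "__main__":
    # Hypothetical tariffs: retail 0.20, feed-in 0.04 (currency per kWh).
    for supply, demand in [(10, 10), (20, 10), (10, 20)]:
        buy, sell = mid_market_rate(supply, demand, 0.20, 0.04)
        print(f"supply={supply} demand={demand} buy={buy:.3f} sell={sell:.3f}")
```

Under this rule the local buy price never exceeds the retail price and the local sell price never falls below the feed-in tariff, which is what gives participants an incentive to trade locally rather than directly with the grid.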
