With the price of green energy becoming more affordable, households can now produce enough electricity to meet their own needs and profit by selling the surplus on a local peer-to-peer (P2P) energy market. For households, energy trading and demand management can reduce electricity costs. In practice, however, consumers submit market offers based on their own expectations and the forecasts of other households, and the P2P exchange system cannot quantify the gap between these offers and the best achievable market outcome. The objective of this paper is to apply deep reinforcement learning to optimal energy trading and demand response (DR) in a P2P market, with the aim of maximizing cost reductions, and to identify the most effective approach for doing so. The joint problem of household energy trading and demand response is formally characterized as a partially observable Markov decision process (POMDP). Through decentralized training and learning from experience, the proposed strategy optimizes policy and value functions. A comparative analysis between the two approaches is carried out to identify the most effective proactive solutions. Simulation results show that applying the recommended reinforcement learning strategy to optimize P2P energy exchange leads to a significant improvement in the average household reward, increasing it by 7.6% and 12.08%.
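The abstract does not specify the network architecture or training details, but the description of decentralized training of policy and value functions suggests an actor-critic style agent per household. The sketch below is purely illustrative, assuming a generic PyTorch actor-critic whose observation is a household's partial local state and whose continuous action is a hypothetical (trade quantity, price offer) pair; names such as `HouseholdActorCritic` and the dimensions used are assumptions, not the paper's implementation.

```python
# Illustrative sketch only; the paper's exact method is not given in the abstract.
import torch
import torch.nn as nn


class HouseholdActorCritic(nn.Module):
    """Policy (actor) and value (critic) heads sharing an observation encoder."""

    def __init__(self, obs_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Actor head: mean of a Gaussian policy over continuous trading actions.
        self.policy_mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))
        # Critic head: scalar state-value estimate used as a learning baseline.
        self.value = nn.Linear(hidden, 1)

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)
        dist = torch.distributions.Normal(self.policy_mean(h), self.log_std.exp())
        return dist, self.value(h).squeeze(-1)


if __name__ == "__main__":
    # Toy rollout step: each household would train its own copy of the network
    # (decentralized training), observing only its local, partial state.
    agent = HouseholdActorCritic(obs_dim=8, action_dim=2)
    obs = torch.randn(4, 8)                 # batch of partial observations
    dist, value = agent(obs)
    action = dist.sample()                  # e.g. (quantity, price) offer
    log_prob = dist.log_prob(action).sum(-1)
    print(action.shape, value.shape, log_prob.shape)
```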