Application of a Gradient Descent Continuous Actor-Critic Algorithm for Double-Side Day-Ahead Electricity Market Modeling

Huiru Zhao,Chao Zhang,Sen Guo,Mingrui Zhao,Yuwei Wang

doi:10.3390/en9090725

Open Access

https://doi.org/10.3390/en9090725

Journal: Energies
Publication Date: Sep 9, 2016
Citations: 12
License type: CC BY 4.0

Affiliation: North China Electric Power University

Abstract

An important goal of China’s electric power system reform is to create a double-side day-ahead wholesale electricity market, in which suppliers (represented by GenCOs) and demanders (represented by DisCOs) compete simultaneously in one market. Modeling and simulating the dynamic bidding process and the equilibrium of such a market is therefore important not only to some developed countries but also to China: it can provide a bidding decision-making tool that helps GenCOs and DisCOs obtain more profit in market competition, and an economic analysis tool that helps government officials design proper market mechanisms and policies. Traditional dynamic game models and table-based reinforcement learning algorithms have already been employed in day-ahead electricity market modeling. However, those models rest on assumptions that no longer hold in a realistic situation: dynamic game market models take the probability distribution function of the market clearing price (MCP) and each rival’s bidding strategy as common knowledge, and table-based reinforcement learning market models assume discrete state and action sets for every agent. In this paper, a modified reinforcement learning method, called the gradient descent continuous Actor-Critic (GDCAC) algorithm, is employed in double-side day-ahead electricity market modeling and simulation. This algorithm not only removes the abovementioned unrealistic assumptions but also copes with a Markov decision process with continuous state and action sets, as in the real electricity market. Meanwhile, the time complexity of the proposed model is only O(n). The simulation results of applying the proposed model to the double-side day-ahead electricity market show the superiority of our approach in terms of participants’ profit or social welfare compared with traditional reinforcement learning methods.

Highlights

To meet the need of economic and social development for an effective power supply, China’s electricity industry has, alongside continuous power system construction, undergone a series of restructurings and changes in recent decades, similar to many other countries around the world.
In order to solve the issue mentioned in the last paragraph of Section 2, we proposed a modified reinforcement learning algorithm, namely the gradient descent continuous Actor-Critic (GDCAC) algorithm.
China is experiencing a new round of electricity market reforms, and the double-side day-ahead electricity market will become more and more important in China’s energy trading area in the future.

Summary

Introduction

To meet the need of economic and social development for an effective power supply, China’s electricity industry has, alongside continuous power system construction, undergone a series of restructurings and changes in recent decades, similar to many other countries around the world. The direct objective of electricity market restructuring in many countries, including China, is to enhance competition and improve operational efficiency [2].

The agent follows the policy maintained by the actor to generate an action. After the action is applied to the environment, the critic receives the immediate reward as feedback from the environment and updates the value function.
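The actor-critic interaction described above can be illustrated with a minimal sketch. It is not the paper's GDCAC update rule, which is not reproduced in this summary; it is a generic one-step actor-critic with linear function approximation, a Gaussian exploration policy, and gradient-style updates driven by the temporal-difference error. The toy environment (a uniformly random state in [0, 1] standing in for a market condition), the reward function, the target `best_action`, and all step sizes are assumptions chosen only for illustration.

```python
import random

GAMMA = 0.9          # discount factor (assumed)
SIGMA = 0.2          # std. dev. of Gaussian exploration noise (assumed)
ALPHA_ACTOR = 0.01   # actor step size (assumed)
ALPHA_CRITIC = 0.05  # critic step size (assumed)


def features(s):
    """Linear features of a continuous state s in [0, 1]."""
    return (1.0, s)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def best_action(s):
    """Hypothetical reward-maximizing action, unknown to the agent."""
    return 0.2 + 0.5 * s


def train(steps=20000, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # actor weights: mean action = theta . features(s)
    w = [0.0, 0.0]      # critic weights: V(s) = w . features(s)
    s = rng.random()
    for _ in range(steps):
        phi = features(s)
        # Actor: sample a continuous action around the policy mean.
        mu = dot(theta, phi)
        a = rng.gauss(mu, SIGMA)
        # Environment: immediate reward and next state (toy dynamics).
        r = -(a - best_action(s)) ** 2
        s_next = rng.random()
        # Critic: temporal-difference error from the immediate reward.
        delta = r + GAMMA * dot(w, features(s_next)) - dot(w, phi)
        for i in range(len(phi)):
            # Critic update: semi-gradient TD(0) on the value weights.
            w[i] += ALPHA_CRITIC * delta * phi[i]
            # Actor update: TD error times the Gaussian score function.
            theta[i] += ALPHA_ACTOR * delta * (a - mu) / SIGMA ** 2 * phi[i]
        s = s_next
    return theta, w


if __name__ == "__main__":
    theta, w = train()
    print("mean action at s=0.5:", dot(theta, features(0.5)))
    print("target action at s=0.5:", best_action(0.5))
```

In this sketch both state and action are real-valued, so no discretized state-action table is needed, and each step touches every weight once, which makes the per-step cost linear in the number of features.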
