China needs to accelerate its low-carbon power transition to reach its carbon neutrality target by 2060. With carbon neutrality as the goal, this study uses deep reinforcement learning with the Double Q-learning method to solve a multi-agent game among 30 Chinese provinces, exploring how regional low-carbon power transition strategies, carbon emissions, and the decoupling of carbon emissions from economic growth evolve. Three conclusions follow. First, regional low-carbon power transition strategies are not consistent: although China is bound to achieve the carbon peak and carbon neutrality, the path to each will vary widely across regions and over time. Second, power decarbonization costs more in resource-based regions than in core regions. A regional coordination policy can reduce total carbon emissions in resource-based regions, but it will not lower their carbon emission decoupling costs. Third, a stable low-carbon power transition strategy requires faster economic development and more abundant zero-carbon resources. Carbon neutrality policy should therefore be more spatially differentiated, more equitable, and more robust.