An optimal energy scheduling strategy for integrated energy systems (IESs) can effectively improve energy utilization efficiency and reduce carbon emissions. Because uncertain factors give an IES a large-scale state space, a well-formulated state-space representation benefits the model training process. This study therefore designs a condition knowledge representation and feedback learning framework based on contrastive reinforcement learning. Because different state conditions lead to different daily economic costs, a dynamic optimization model based on the deep deterministic policy gradient is established so that the condition samples can be partitioned according to their preoptimized daily costs. To represent the overall conditions on a daily basis and constrain the uncertain states in the IES environment, the state-space representation is constructed by a contrastive network that accounts for the time dependence of variables. A learning architecture based on the Monte Carlo policy gradient is further proposed to optimize the condition partition and improve policy learning performance. To verify the effectiveness of the proposed method, typical load operation scenarios of an IES are used in simulations. Human-experience strategies and state-of-the-art approaches are selected for comparison. The results validate the advantages of the proposed approach in terms of cost-effectiveness and adaptability to uncertain environments.