Reward strategies for adaptive start-up scheduling of power plant

A Kamiya, S Kobayashi, K Kawai

doi:10.1109/icsmc.1997.633181

Abstract

Power plant start-up scheduling aims to minimize start-up time while keeping the maximum turbine-rotor stresses within limits. A shorter start-up time reduces fuel and electricity consumption during start-up and makes the plant more adaptable to changes in electricity demand. Online start-up scheduling increases the flexibility of power plant operation. The start-up scheduling problem can be formulated as a constrained combinatorial optimization problem. The search space is wide and high-dimensional, with a number of local optima, and the optimal schedule lies near the boundary of the feasible space. To achieve an efficient and robust search model, we propose combining genetic algorithms (GAs) with an enforcement operator that focuses the search along the boundary and with other local search strategies, such as the reuse function and tabu search. We also propose integrating GAs with reinforcement learning. During the search process, GAs guide the learning toward promising areas, while reinforcement learning can generate a good schedule at an early stage of the search. After learning representative optimal schedules, the search performance virtually satisfies the goal of this research: to find optimal or near-optimal schedules within 30 seconds. For industrial use, the design of the reward strategy is crucial. We show that (a) positive rewards succeed with both low- and high-dimensional reinforcement-learning output, and (b) negative rewards succeed only with low-dimensional output. We present the proposed model together with analysis and test results.
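As a rough illustration of why focusing the search on the feasibility boundary helps, the following Python sketch moves a candidate schedule onto that boundary. It is not the enforcement operator of the paper, whose details the abstract does not give. The schedule encoding (one duration per start-up stage), the toy stress model `peak_stress`, the coefficients, the stress limit, and the function `enforce_to_boundary` are all assumptions made for this example. The sketch further assumes that stretching every stage by a common factor never makes a feasible schedule infeasible.

```python
from typing import Callable, List

# Hypothetical encoding: duration in minutes of each start-up stage.
Schedule = List[float]


def peak_stress(schedule: Schedule, coeffs: List[float]) -> float:
    """Toy stand-in for a rotor stress model: a shorter stage means higher stress."""
    return max(c / d for c, d in zip(coeffs, schedule))


def enforce_to_boundary(
    schedule: Schedule,
    is_feasible: Callable[[Schedule], bool],
    lo: float = 0.1,
    hi: float = 10.0,
    iters: int = 40,
) -> Schedule:
    """Rescale all durations by the smallest factor in [lo, hi] that stays feasible.

    Assumes feasibility is monotone in the scale factor (illustrative assumption).
    """
    def scaled(factor: float) -> Schedule:
        return [d * factor for d in schedule]

    if not is_feasible(scaled(hi)):
        raise ValueError("no feasible scaling within the given range")
    if is_feasible(scaled(lo)):
        return scaled(lo)
    # Bisection invariant: scaled(hi) is feasible, scaled(lo) is not.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if is_feasible(scaled(mid)):
            hi = mid
        else:
            lo = mid
    return scaled(hi)


# Assumed example data, not taken from the paper.
COEFFS = [100.0, 200.0, 150.0]
STRESS_LIMIT = 10.0


def feasible(schedule: Schedule) -> bool:
    return peak_stress(schedule, COEFFS) <= STRESS_LIMIT


candidate = [30.0, 30.0, 30.0]          # feasible, but with slack (90 minutes)
tightened = enforce_to_boundary(candidate, feasible)
total_time = sum(tightened)             # start-up time after enforcement
```

In this toy setup the second stage is the binding one, so the boundary is reached at a scale factor of 2/3 and the total start-up time drops from 90 to about 60 minutes while the stress limit is still respected. A genetic algorithm could apply such a step to each offspring so that evaluations concentrate where the optimum is expected to lie.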

Full Text