Abstract

Maintenance scheduling for wind farms is a recently emerged research topic that involves uncertain factors introduced by weather conditions. However, most existing methods cannot generate maintenance schedules dynamically according to stochastic weather conditions. This paper formulates the maintenance scheduling problem as a Markov decision process (MDP). The Soft Actor-Critic (SAC) method is used to solve the MDP, which has an extremely large state space. SAC is an off-policy deep reinforcement learning algorithm that incorporates entropy regularization into action selection. This mechanism accelerates the training of the agent and prevents premature convergence to a locally optimal solution. Numerical examples are used to verify the performance of SAC in maintenance scheduling. Results show that the proposed method obtains higher total production than the deep Q-network (DQN) and the genetic algorithm when stochastic wind speed is considered.
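For reference, the entropy regularization mentioned above corresponds to the standard maximum-entropy objective that SAC optimizes (the general form from the SAC literature; the paper may use a variant adapted to the maintenance MDP):

$$ J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right], $$

where $r(s_t, a_t)$ is the reward, $\mathcal{H}$ is the entropy of the policy at state $s_t$, and $\alpha$ is a temperature parameter trading off reward maximization against exploration; the entropy bonus is what discourages premature convergence to a local optimum.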
