Abstract

In this work, a pitch controller of a wind turbine (WT) inspired by reinforcement learning (RL) is designed and implemented. The control system consists of a state estimator, a reward strategy, a policy table, and a policy update algorithm. Novel reward strategies related to the energy deviation from the rated power are defined. They are designed to improve the efficiency of the WT. Two new categories of reward strategies are proposed: “only positive” (O-P) and “positive-negative” (P-N) rewards. The relationship of these categories with the exploration-exploitation dilemma, the use of ϵ-greedy methods, and the learning convergence is also introduced and linked to the WT control problem. In addition, an extensive analysis of the influence of the different rewards on the controller performance and on the learning speed is carried out. The controller is compared with a proportional-integral-derivative (PID) regulator for the same small wind turbine and obtains better results. The simulations show how the P-N rewards improve the performance of the controller, stabilize the output power around the rated power, and reduce the error over time.
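
As a rough illustration of the two reward categories, the sketch below shows one possible way to shape "only positive" (O-P) and "positive-negative" (P-N) rewards from the deviation of the output power from the rated power. The rated-power value, the tolerance band DELTA, and the reward magnitudes are assumptions made for the example, not values taken from the paper.

    # Illustrative sketch only: hypothetical reward shapes for an RL pitch
    # controller that tracks rated power. All numeric values are assumed.

    P_RATED = 100.0e3          # rated power in W (hypothetical small WT value)
    DELTA = 0.05 * P_RATED     # tolerance band around rated power (assumed)

    def reward_only_positive(power: float) -> float:
        """O-P strategy: reward only when output power stays near rated power."""
        return 1.0 if abs(power - P_RATED) <= DELTA else 0.0

    def reward_positive_negative(power: float) -> float:
        """P-N strategy: reward near rated power, penalize larger deviations."""
        error = abs(power - P_RATED)
        return 1.0 if error <= DELTA else -error / P_RATED

Under this kind of shaping, the P-N variant feeds back information even when the turbine operates far from rated power, which is consistent with the faster error reduction reported for P-N rewards.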

Highlights

  • Wind energy gains strength year after year

  • This may be even more critical for floating offshore wind turbines (FOWT), as it has been shown that the control system can affect the stability of the floating device [3,4]

  • The controller is composed of a state estimator, a policy update algorithm, a reward strategy, and an actuator

Summary

Introduction

This control system usually pitches the blades a few degrees every time the wind changes in order to keep the rotor blades at the required angle, thereby controlling the rotational speed of the turbine [5,6]. This is not a trivial task due to the non-linearity of the equations that describe the turbine dynamics, the coupling between the internal variables, and the uncertainty that comes from external loads [7], mainly wind and, in the case of FOWT, waves and currents, which make its dynamics change over time [8]. In the literature, the RL approach has been applied to other control actions of wind turbines or to related problems with successful results. However, this learning strategy has not been directly applied to pitch control, nor have the reward mechanisms been analyzed in order to improve the control performance.
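
To make the learning loop concrete, the following is a minimal sketch of a tabular ϵ-greedy pitch controller, assuming a discretized state (for example, binned power error) and a small set of pitch increments as actions. The state and action definitions, the learning rate, and the exploration rate are illustrative assumptions, not the paper's exact design.

    import random

    ACTIONS = [-1.0, 0.0, +1.0]    # pitch increments in degrees (assumed)
    N_STATES = 21                  # number of discretized error bins (assumed)
    ALPHA, EPSILON = 0.1, 0.1      # learning rate and exploration rate (assumed)

    # Policy table: one row per state, one value per candidate action.
    policy_table = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

    def select_action(state: int) -> int:
        """Epsilon-greedy selection over the policy table."""
        if random.random() < EPSILON:
            return random.randrange(len(ACTIONS))   # explore
        row = policy_table[state]
        return row.index(max(row))                  # exploit

    def update_policy(state: int, action: int, reward: float) -> None:
        """Move the table entry for (state, action) toward the observed reward."""
        policy_table[state][action] += ALPHA * (reward - policy_table[state][action])

In such a scheme, the balance between exploration (random pitch increments) and exploitation (the best-known increment) is set by EPSILON, which is where the reward strategy interacts with the exploration-exploitation dilemma discussed in the paper.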

Wind Turbine Model Description
RL-Inspired Controller
Exploring Reward Strategies
Simulation Results and Discussion
Influence of the Reward Window
Influence of the Size of the
Conclusions and Future Works