Abstract
This article presents temporal-difference (TD) learning, which combines Monte Carlo methods and dynamic programming (DP), as a method for controlling single-body wave energy converters (WECs). Because TD methods are designed to solve prediction problems, we use this property to maximize the energy captured from sea waves. The force acting on the buoy system is addressed implicitly in the state matrix so that the problem can be cast in a TD framework. To enhance the power captured by the WEC, the control method is built as an online active controller, which lets the device predict the best control action based on its previous experience in the same situations. Two TD methods, Q-learning and SARSA, are used; their features are analyzed, and several test functions are evaluated in the simulation part. To perform online optimal control, a force control acts as the controller, and the TD coefficients are tuned at a suitable rate after a specific number of episodes. The performance of the proposed TD methods is compared with PGM, IPOPT, and other learning control strategies. Several computer simulations were carried out to evaluate the controller's effectiveness by applying different sea states and analyzing the resulting WEC dynamics.
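For readers unfamiliar with the two TD methods named above, the following is a minimal sketch of the standard tabular Q-learning and SARSA update rules. It is not the article's controller: the table layout, the function names, and the toy numbers are all hypothetical, and the abstract does not specify how states, actions, or rewards are defined for the WEC. One plausible reading, assumed here only for illustration, is that the reward is the energy captured over a time step and the action is a discretized control force.

```python
# Illustrative sketch only: textbook tabular TD updates, not the article's code.
# The state/action indices and numeric values below are hypothetical.

def q_learning_update(q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


# Toy action-value tables: 2 hypothetical states x 2 hypothetical actions.
q1 = [[0.0, 0.0], [1.0, 2.0]]
q2 = [[0.0, 0.0], [1.0, 2.0]]

# Same transition (s=0, a=0, r=1.0, s_next=1) fed to both rules.
q_val = q_learning_update(q1, 0, 0, 1.0, 1, alpha=0.5, gamma=0.9)
s_val = sarsa_update(q2, 0, 0, 1.0, 1, 0, alpha=0.5, gamma=0.9)
```

The only difference between the two rules is the bootstrap term: Q-learning uses the maximum action value in the next state, while SARSA uses the value of the action the current policy actually selects there.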