Abstract

Birds and experienced glider pilots frequently exploit atmospheric updrafts to fly long distances and conserve energy, relying on the energy harvested from those updrafts to stay aloft. Inspired by these shared characteristics of autonomous soaring, a reinforcement learning algorithm, the Twin Delayed Deep Deterministic policy gradient (TD3), is used to investigate the optimal strategy for an unpowered glider to harvest energy from thermal updrafts. A round updraft model is used to represent updrafts of varying strength, and a high-fidelity six-degree-of-freedom model describes the glider dynamics. Results for a range of initial flight positions and updraft strengths demonstrate the effectiveness of the strategy learned via reinforcement learning. To enhance updraft perception and broaden the applicability of the trained glider agent, an additional wind velocity differential correction module is introduced into the algorithm, and a strategy symmetry method is applied. Comparison experiments in the round updraft, the Gedeon thermal model, and Dryden continuous turbulence indicate that these further optimizations play a crucial role in improving the updraft-sensing ability of the reinforcement learning glider agent. With these optimizations, a glider trained in a simplified thermal updraft with a simple training method can learn more effective flight strategies.
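For readers unfamiliar with the Twin Delayed Deep Deterministic policy gradient, the following is a minimal sketch of its bootstrap target for a single transition with a scalar action. It illustrates the two features that distinguish the algorithm from plain DDPG: target policy smoothing and clipped double Q-learning. The function name `td3_target`, the callables `target_actor`, `target_q1`, and `target_q2`, and all default hyperparameter values are illustrative assumptions; they are not the glider agent's networks or the settings used in this work.

```python
import random


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Return the TD3 bootstrap target for one transition (scalar action).

    All callables and default values here are hypothetical placeholders.
    """
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = target_actor(next_state) + noise
    # Keep the perturbed action inside the valid action range.
    next_action = max(action_low, min(action_high, next_action))
    # Clipped double Q-learning: use the smaller of the two target critic values.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Do not bootstrap past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_next
```

Taking the minimum of two critics counteracts the overestimation bias of a single critic, and the clipped noise smooths the value estimate over nearby actions. In a full implementation the actor and target networks are also updated less often than the critics, which is the "delayed" part of the name and is not shown here.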
