Abstract

A novel morphing strategy based on reinforcement learning (RL) is developed to solve the morphing decision-making problem of minimizing flight time for a long-range variable-sweep morphing aircraft. The proposed strategy addresses the sparse-reward, no-reference decision-making problem that arises from terminal performance objectives and long-range missions. A double-layer morphing-flight control framework is established to decouple the design of the morphing strategy from the flight controller while ensuring flight stability. Under this framework, an RL agent is designed to learn the minimum-flight-time morphing strategy. Specifically, the reward function is divided into primary-goal rewards and sub-goal rewards to address the sparse-reward, no-reference issue. A multi-stage progressive training scheme is developed to train the RL agent on a sequence of training environments that gradually converges to the real-world environment; this scheme accelerates training and promotes convergence of the RL agent. Simulation results under nominal and dispersed conditions demonstrate the optimality and robustness of the proposed morphing strategy, and its generalization ability is further validated in an untrained scenario.
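The abstract does not publish the paper's reward terms or training loop, but the two techniques it names can be sketched in outline. The snippet below is a minimal, hypothetical illustration: every signal name, coefficient, and shaping term is an assumption standing in for the paper's actual formulation, and the closing loop only gestures at the multi-stage progressive training idea of moving the agent through increasingly realistic environments.

```python
# Illustrative sketch only: the paper's actual reward terms and environment
# stages are not given in the abstract, so all names and weights here are
# hypothetical stand-ins.

def morphing_reward(state, done, reached_target, flight_time,
                    w_primary=1.0, w_sub=0.01):
    """Composite reward: a sparse primary-goal term paid only at episode
    end, plus dense sub-goal terms that guide learning in between."""
    reward = 0.0

    # Primary-goal reward: granted once, at the terminal step. A shorter
    # flight time to the target yields a larger payoff; failure is penalized.
    if done:
        if reached_target:
            reward += w_primary * (1.0 / flight_time)  # minimize flight time
        else:
            reward -= w_primary

    # Sub-goal rewards: dense shaping signals available at every step,
    # e.g. rewarding progress toward the target and penalizing deviations
    # that threaten flight stability.
    progress = state["prev_dist_to_target"] - state["dist_to_target"]
    reward += w_sub * progress
    reward -= w_sub * abs(state["altitude_error"])

    return reward


def progressive_training(agent, env_stages, steps_per_stage):
    """Multi-stage progressive (curriculum-style) training sketch: train the
    same agent on a sequence of environments of increasing fidelity, e.g.
    simplified dynamics first, the full dispersed model last. The agent's
    `train` method is assumed, not taken from the paper."""
    for env in env_stages:
        agent.train(env, steps=steps_per_stage)
    return agent
```

Under this reading, the sub-goal terms supply a learning signal long before the terminal reward becomes observable, which is what makes the sparse-reward, no-reference problem tractable, while the staged environments let early training converge quickly before the full-fidelity dynamics are introduced.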
