Abstract

Motion primitives (MPs) provide a fundamental abstraction of movement templates that can be used to guide and navigate a complex environment while simplifying the movement actions. When MPs are used as the action space in reinforcement learning (RL), an agent can learn to select a sequence of simple actions that guides a vehicle toward desired complex mission outcomes. This is particularly useful for missions involving high-speed aerospace vehicles (HSAVs) (i.e., Mach 1 to 30), where near-real-time trajectory generation is needed but the computational cost and latency of trajectory generation remain prohibitive. This paper demonstrates that when MPs are employed in conjunction with RL, the agent can learn to solve a wider range of HSAV mission problems. To this end, RL is employed, using both an MP and a non-MP approach, to solve the problem of an HSAV arriving at a non-maneuvering moving target at a constant altitude and with an arbitrary, but constant, velocity and heading angle. The MPs for the HSAV consist of multiple pull (flight path angle) and turn (heading angle) commands, each defined for a specific duration based on the mission phase, whereas the non-MP approach uses angle of attack and bank angle as the action space for RL. The paper details the HSAV problem formulation, including the equations of motion, observation space, telescopic reward function, RL algorithm and hyperparameters, RL curriculum, formation of the MPs, and calculation of the time to execute each MP. Our results demonstrate that the non-MP approach is unable to train an agent that succeeds even in the base case of the RL curriculum. The MP approach, however, trains an agent with a 76.6% success rate in arriving at a target moving with an arbitrary heading angle and a velocity between 0 and 500 m/s.
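As a rough illustration of the contrast between the two action spaces described above, the sketch below sets up a discrete motion-primitive action space alongside a continuous angle-of-attack/bank-angle action space using the Gymnasium spaces API. The primitive names, angle increments, and command bounds are illustrative assumptions for this sketch, not values or an implementation taken from the paper.

# Minimal sketch (not the authors' implementation) contrasting the two action
# spaces described in the abstract. The specific pull/turn angles and bounds
# below are illustrative assumptions, not values from the paper.
import numpy as np
from gymnasium import spaces

# Motion-primitive (MP) action space: the agent picks one discrete template,
# e.g. a pull (flight-path-angle) or turn (heading-angle) command, which is
# then held for a duration tied to the mission phase.
MOTION_PRIMITIVES = [
    {"name": "pull_up",    "d_gamma_deg": +5.0, "d_psi_deg": 0.0},
    {"name": "pull_down",  "d_gamma_deg": -5.0, "d_psi_deg": 0.0},
    {"name": "turn_left",  "d_gamma_deg": 0.0,  "d_psi_deg": -10.0},
    {"name": "turn_right", "d_gamma_deg": 0.0,  "d_psi_deg": +10.0},
    {"name": "hold",       "d_gamma_deg": 0.0,  "d_psi_deg": 0.0},
]
mp_action_space = spaces.Discrete(len(MOTION_PRIMITIVES))

# Non-MP action space: continuous angle-of-attack and bank-angle commands
# applied at every control step (bounds here are placeholders).
non_mp_action_space = spaces.Box(
    low=np.array([-10.0, -60.0], dtype=np.float32),   # [alpha_deg, bank_deg]
    high=np.array([+20.0, +60.0], dtype=np.float32),
)

if __name__ == "__main__":
    # Sample one action from each space to show the difference in granularity.
    mp_idx = mp_action_space.sample()
    print("MP action:", MOTION_PRIMITIVES[mp_idx]["name"])
    print("Non-MP action [alpha, bank]:", non_mp_action_space.sample())

In this framing the MP agent chooses among a handful of templates per decision step, which shrinks the search space relative to selecting continuous angle-of-attack and bank-angle commands at every control step.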
