Abstract
For trajectory planning in a three-dimensional continuous space under multiple complex constraints, traditional optimal control methods face significant challenges, such as long computation times and poor dynamic performance. Training agents with deep reinforcement learning for flight trajectory planning can effectively improve the dynamic performance of the planner. This paper focuses on training agents for UAV emergency landing and obstacle avoidance trajectory planning based on deep reinforcement learning algorithms. Within the reinforcement learning framework, we design the training environment, state space, action space, reward function, and neural network structure. To address common issues in reinforcement learning, such as poor generalization and slow training, we improve the training process using curriculum learning. Comparisons with pseudospectral methods demonstrate that deep reinforcement learning can effectively handle trajectory planning tasks in real flight environments, achieving trajectory quality close to that of pseudospectral methods while offering superior real-time performance.