In this article, a learning-based trajectory generation framework is proposed for quadrotors, which guarantees real-time, efficient, and practically reliable navigation by making human-like decisions online via reinforcement learning (RL) and imitation learning (IL). Specifically, inspired by human driving behavior and the limited perception range of onboard sensors, a real-time local planner is designed by combining learning and optimization techniques, so that smooth and flexible trajectories are planned online and efficiently within the observable area. In particular, the two key problems in the framework, temporal optimality (time allocation) and spatial optimality (trajectory distribution), are solved by an RL policy that issues human-like commands in real time (e.g., slower or faster) to achieve better navigation, instead of generating traditional low-level motions. In this manner, real-time trajectories are computed by convex optimization according to the efficient and accurate decisions of the RL policy. In addition, an expert policy and IL are employed in the framework to improve generalization performance and to accelerate training. Compared with existing works, the core contribution is a real-time, practice-oriented, intelligent trajectory generation framework for quadrotors, in which human-like decision-making and model-based optimization are integrated to plan high-quality trajectories. The results of comparative experiments in known and unknown environments illustrate the superior performance of the proposed trajectory generation strategy in terms of efficiency, smoothness, and flexibility.
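To make the division of labor between the high-level commands and the optimizer concrete, the following is a minimal sketch and not the article's implementation. It assumes a hypothetical command set (`slower`, `keep`, `faster`) with illustrative scaling factors, and it stands in for the convex optimization step with the well-known closed-form minimum-jerk profile for a single one-dimensional, rest-to-rest segment. All names and numbers here are assumptions chosen for illustration.

```python
# Hypothetical command set and scaling factors (assumed, not from the article).
TIME_SCALE = {"slower": 1.25, "keep": 1.0, "faster": 0.8}


def apply_command(duration, command):
    """Rescale a segment's allocated time according to a high-level command."""
    return duration * TIME_SCALE[command]


def min_jerk_position(p0, p1, duration, t):
    """Position at time t on a 1-D rest-to-rest minimum-jerk segment.

    This quintic is the closed-form minimizer of integrated squared jerk
    with zero velocity and acceleration at both ends. It is used here as
    a stand-in for a general convex trajectory optimization.
    """
    s = min(max(t / duration, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return p0 + (p1 - p0) * (10 * s**3 - 15 * s**4 + 6 * s**5)


def peak_speed(p0, p1, duration):
    """Peak speed of the segment above, reached at its midpoint."""
    return 1.875 * abs(p1 - p0) / duration


# A "faster" command shortens the segment time and raises the peak speed.
nominal = 2.0
shortened = apply_command(nominal, "faster")
print(peak_speed(0.0, 4.0, nominal), peak_speed(0.0, 4.0, shortened))
```

In this sketch the policy never outputs motor or velocity commands directly: it only adjusts the time allocation, and the trajectory shape follows from the optimization given that allocation.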