Numerous civil and military missions involve vehicle pursuit-evasion problems. With its vertical takeoff and landing capability, the intelligent air–ground vehicle extends its feasible paths into 3-D space, which offers great advantages in pursuit. Such a vehicle requires adequate path planning to obtain an optimal 3-D path and further improve pursuit efficiency. The planning process of the air–ground vehicle currently faces the technical difficulty of determining the proper takeoff timing and position while optimizing the trajectory, especially in complex environments with dense obstacles. To solve these issues, a game-learning-based smooth path planning strategy for the intelligent air–ground vehicle considering mode switching is proposed in this article. First, a new reward function for the Q-learning algorithm, which accounts for the influence of flight obstacle-crossing parameters, is presented to explore short forward track distances. Second, the pursuit-evasion game is embedded in the update rule to drive mode-switching decisions. During interactive learning between the vehicle and the environment, this game continuously updates the Nash equilibrium solutions for mode switching and yields a series of switching decisions for the pursuer (ego) vehicle. Third, a double-yaw correction for path smoothing is proposed to reduce turning points and avoid local path deviations. This modification provides heuristic information for environment exploration, which significantly accelerates the convergence of the algorithm. Finally, the proposed strategy is verified on a 1000 m × 1000 m map with obstacle heights of 0–200 m. Results show that this strategy shortens the path by 253 m compared with the traditional reinforcement learning algorithm and converges faster. The number of trajectory direction changes is 36% lower than that of the game-learning algorithm that considers only mode switching, and unreasonably large-angle turns are eliminated.
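To make the abstract's main ideas concrete, the following is a minimal sketch, not the authors' implementation, of a Q-learning update whose reward includes a flight obstacle-crossing term and whose ground/air mode switch is chosen by a toy two-player matrix game standing in for the Nash-equilibrium step. All names and parameter values (crossing_penalty, switch_mode_by_game, cross_limit, the payoff matrices) are hypothetical assumptions, and the grid-world setup is assumed rather than taken from the paper.

```python
# Illustrative sketch only: Q-learning with an obstacle-crossing-aware reward
# and a toy matrix-game mode switch. Assumptions: 4-connected grid world,
# two modes (GROUND/AIR), 2x2 payoff matrices. Not the paper's actual method.
import numpy as np

GROUND, AIR = 0, 1
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 4-connected grid moves

def reward(pos, goal, mode, obstacle_height, cross_limit=200.0):
    """Reward with a flight obstacle-crossing term, as hinted in the abstract."""
    step_cost = -1.0                               # encourage a short forward track
    goal_bonus = 100.0 if pos == goal else 0.0
    crossing_penalty = 0.0
    if mode == AIR:                                # penalize climbing over tall obstacles
        crossing_penalty = -obstacle_height / cross_limit
    elif obstacle_height > 0.0:                    # ground mode cannot cross obstacles
        crossing_penalty = -10.0
    return step_cost + goal_bonus + crossing_penalty

def switch_mode_by_game(payoff_pursuer, payoff_evader):
    """Toy stand-in for the Nash-equilibrium mode-switching step: the pursuer
    plays a best response to the evader's best column in a 2x2 matrix game."""
    evader_col = int(np.argmax(payoff_evader.sum(axis=0)))
    return int(np.argmax(payoff_pursuer[:, evader_col]))   # returns GROUND or AIR

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Standard Q-learning update rule; Q maps state tuples to action-value arrays."""
    Q[s][a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s][a])
```

In this sketch the crossing penalty only shapes the reward, while the mode switch is resolved by the game before each step; the paper's double-yaw smoothing correction is a separate post-processing/heuristic stage and is not shown here.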