This paper proposes a reinforcement learning-based formation-surrounding control method for multi-quadrotor unmanned aerial vehicle (UAV) pursuit-evasion (MPE) game systems subject to external disturbances. In the MPE game framework, the pursuers aim to surround the evaders evenly while forming the desired formation, whereas the evaders try to avoid being surrounded. By constructing position and attitude tracking error subsystems for the quadrotor UAV, this paper proposes two control strategies that combine the feedforward control technique with the reinforcement learning (RL) method. First, two novel cost functions are presented for the quadrotor UAV with external disturbances. Then, two RL-based control schemes are developed to guarantee the stability of the tracking error subsystems. Subsequently, two critic-only neural network (NN) weight update laws that require only finite excitation conditions are proposed to estimate the optimal cost functions. Furthermore, a Nash equilibrium for the multiple quadrotor UAVs is achieved by using the RL strategy to solve the Hamilton-Jacobi-Isaacs (HJI) equations. The equal-surrounding property is proved for the first time using Euler's formula. Finally, numerical simulation results are given to show the effectiveness and superior performance of the proposed control method.
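As a rough illustration of what equal surrounding means geometrically, the following sketch places pursuers at equal angular spacing around an evader using Euler's formula, e^{iθ} = cos θ + i sin θ. It is not the paper's proof or controller. It assumes a planar simplification with positions encoded as complex numbers, and the names `surround_positions` and `centroid`, the radius, and the number of pursuers are all hypothetical choices made for this example.

```python
import cmath
import math


def surround_positions(center, radius, n):
    """Return n points equally spaced on a circle of given radius around center.

    Assumption: positions are planar and encoded as complex numbers (x + iy).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Euler's formula: exp(i*theta) = cos(theta) + i*sin(theta),
    # so each term is a point on the unit circle at angle 2*pi*k/n.
    return [center + radius * cmath.exp(1j * 2.0 * math.pi * k / n)
            for k in range(n)]


def centroid(points):
    """Arithmetic mean of a list of complex positions."""
    return sum(points) / len(points)


# Hypothetical scenario: one evader, five pursuers at distance 2.
evader = complex(3.0, -1.0)
pursuers = surround_positions(evader, radius=2.0, n=5)

# For n >= 2 the n-th roots of unity sum to zero, so the centroid of the
# pursuers coincides with the evader up to floating-point error.
offset = abs(centroid(pursuers) - evader)
print(f"centroid offset from evader: {offset:.2e}")
```

The zero centroid offset follows from the fact that the n-th roots of unity sum to zero for n ≥ 2, which is one simple way to see that an equally spaced ring leaves the evader at its center.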