Abstract

Applications of multiple unmanned aerial vehicles (UAVs) involve complex control dynamics for accomplishing any task. This paper employs a multi-UAV system for continuous tracking and end-to-end coverage of a moving convoy of vehicles to provide security and surveillance cover. Coverage is achieved by maintaining the moving convoy within the overlapping Fields-of-View (FoVs) of the UAVs. To learn the controls of the autonomous multi-UAV system, we propose a deep-reinforcement-learning-based multi-agent actor-critic method called GPR-MADDPG. The proposed method uses Gaussian Process Regression (GPR) to estimate an unbiased and stable target value for the critic. Further, the kernel function of the GPR model is adapted to keep the high variance in the convoy trajectory in check. The rewards for training the multi-UAV system are formulated to maximize end-to-end convoy coverage by optimizing the overlaps between the FoVs while minimizing the tracking error. Experiments were performed on real-world road trajectories of varying complexity, with varying convoy speeds and numbers of UAVs. Further tests were performed using a simulator with a real-world physics engine. The experiments show that the proposed GPR-MADDPG model yields the lowest overlapping error and accumulates the maximum reward compared to other prevalent approaches in the literature.
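The core idea of using GPR to stabilize the critic's target value can be illustrated with a minimal sketch. The function names, the squared-exponential kernel, and the toy 1-D state space below are illustrative assumptions, not the paper's actual implementation: a GP posterior mean is fitted to noisy bootstrapped (TD) targets, and that smoothed estimate replaces the raw targets when updating the critic.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel; a larger length_scale smooths more aggressively.
    # The paper adapts its kernel to high-variance convoy trajectories; here we
    # simply expose length_scale as the tunable hyperparameter.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gpr_target(states, td_targets, query_states, length_scale=1.0, noise=0.1):
    """Posterior mean of a GP fitted to noisy TD targets: a smoothed,
    lower-variance estimate used in place of the raw bootstrapped target."""
    K = rbf_kernel(states, states, length_scale)
    K_s = rbf_kernel(query_states, states, length_scale)
    alpha = np.linalg.solve(K + noise * np.eye(len(states)), td_targets)
    return K_s @ alpha

# Toy example: noisy TD targets over a 1-D state space.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(50, 1))
true_value = np.sin(3.0 * S[:, 0])            # hypothetical "true" value function
y = true_value + 0.3 * rng.standard_normal(50)  # noisy bootstrapped targets
smoothed = gpr_target(S, y, S, length_scale=0.5)
```

In a full MADDPG loop, `smoothed` would stand in for the usual `r + gamma * Q_target(s', a')` term in the critic loss, trading a small amount of bias from the kernel prior for a substantial reduction in target variance.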
