LifeGuard: An Improvement of Actor-Critic Model with Collision Predictor in Autonomous UAV Navigation

Manit Chansuparp, Kulsawasd Jitkajornwanich

doi:10.1080/08839514.2022.2137632

Abstract

The need for autonomous unmanned aerial vehicle navigation (AUN) has grown in recent years, driven by the growth of the logistics industry and the need for social distancing during the pandemic. Various methods have been proposed for the AUN task, most of them based on deep reinforcement learning (DRL). However, the results remain far from satisfactory, and where good results were reported, the environment was usually too simple. In this paper, we report one cause of the low AUN success rate observed in our previous work: the apprehensive behavior of agents. Even after numerous training episodes, an agent facing a risky scene often moves back and forth repeatedly until it exhausts its step limit. We therefore propose adding a new role, LifeGuard, to the popular Actor-Critic DRL model to counter this apprehensive behavior and improve the success rate. In addition, we developed a pilot method for unsupervised classification of sequential data to further enhance the reward function from our previous work, the augmentative backward reward function. The experimental results demonstrate that the proposed method eliminates the apprehensive behavior and achieves higher success rates than the state-of-the-art method, FORK, with less effort.

Full Text