Abstract

Path planning is one of the most essential parts of autonomous navigation. Most existing works assume that the environment is static and fixed, yet path planning is widely needed in random and dynamic environments, such as search and rescue, surveillance, and other scenarios. In this paper, we propose a Deep Reinforcement Learning (DRL)-based method that enables unmanned aerial vehicles (UAVs) to execute navigation tasks in multi-obstacle environments with randomness and dynamics. The method is based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. To predict the impact of the environment on the UAV, the change in environment observations is added to the Actor–Critic network input, and a two-stream Actor–Critic network structure is proposed to extract features from those observations. Simulations are carried out to evaluate the algorithm, and the results show that our method enables the UAV to complete autonomous navigation tasks safely in multi-obstacle environments. Moreover, compared with DDPG and the conventional TD3, our method has better generalization ability.
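
The abstract does not specify the network internals, but as a rough illustration only, a minimal PyTorch sketch of what such a two-stream actor might look like is given below. The class name `TwoStreamActor`, the layer widths, the use of the observation difference as the second stream, and the concatenation-based fusion are all assumptions for illustration, not the authors' exact design:

```python
import torch
import torch.nn as nn

class TwoStreamActor(nn.Module):
    """Hypothetical two-stream actor: one stream encodes the current
    environment observation, the other encodes its change between
    consecutive steps; features are fused before the action head."""

    def __init__(self, obs_dim: int, action_dim: int,
                 max_action: float, hidden: int = 256):
        super().__init__()
        # Stream 1: current observation o_t (assumed input)
        self.obs_stream = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # Stream 2: observation change, e.g. o_t - o_{t-1} (assumed input)
        self.delta_stream = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # Fusion of both streams, then a bounded action head (TD3-style)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, obs: torch.Tensor, obs_delta: torch.Tensor) -> torch.Tensor:
        # Encode each stream separately, concatenate, and map to an action
        features = torch.cat(
            [self.obs_stream(obs), self.delta_stream(obs_delta)], dim=-1)
        return self.max_action * self.head(features)
```

Keeping the two streams separate before fusion lets the network weight the static scene features and their temporal change independently, which is one plausible way to realize the observation-change input the abstract describes.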
