Abstract

Path planning is one of the key technologies for the autonomous flight of unmanned aerial vehicles (UAVs). Traditional path planning algorithms have limitations and deficiencies in complex and dynamic environments. In this article, we propose a deep reinforcement learning approach for three-dimensional path planning that uses only local information and the relative distance to the target, without global information. In practice, a UAV with limited sensor capabilities can observe only its nearby environment, so path planning can be formulated as a Partially Observable Markov Decision Process (POMDP). A recurrent neural network with temporal memory is constructed to address the partial observability by extracting crucial information from historical state-action sequences. We develop an action selection strategy that combines the current reward value with the state-action value to reduce meaningless exploration. In addition, we construct two sample memory pools and propose an adaptive experience replay mechanism based on the frequency of failure. Simulation results show that our method significantly improves on Deep Q-Network and Deep Recurrent Q-Network in terms of stability and learning efficiency. Our approach plans a reasonable three-dimensional path in a large-scale, complex environment and reliably avoids obstacles in unknown environments.
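To make the POMDP formulation above concrete, the following is a minimal sketch of a DRQN-style recurrent Q-network that summarizes the observation history, in the spirit of the recurrent architecture the abstract describes; the class name, layer sizes, and exact structure are illustrative assumptions, not the paper's reported architecture.

```python
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    """LSTM-based Q-network over a history of local observations
    (illustrative sketch; layer sizes and names are assumptions)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)               # embed one observation
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)   # temporal memory
        self.q_head = nn.Linear(hidden, n_actions)              # Q(s, a) per action

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, obs_dim) -- local sensor readings plus
        # the relative distance to the target at each step.
        x = torch.relu(self.encoder(obs_seq))
        x, hidden_state = self.lstm(x, hidden_state)
        # Q-values at every step; the last step's values drive action selection.
        return self.q_head(x), hidden_state
```

Because the LSTM carries a hidden state across steps, the network conditions its Q-values on the whole state-action history rather than on the current partial observation alone, which is what makes the partially observable setting tractable.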

Highlights

  • The unmanned aerial vehicle (UAV) has attracted wide attention in both military and civilian fields because of its low cost, flexibility, and small size [1], [2]

  • We propose a deep reinforcement learning approach to solve the problem of the UAV path planning in the complex and dynamic environment

  • The main contributions of this article are summarized as follows: 1) We propose a new action selection strategy that combines the current reward value R with the Q value, which addresses the inaccurate predictions of the neural network at the early stage of training (a minimal sketch follows this list)
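As referenced in the last highlight, here is a minimal sketch of an R-and-Q blended action selection rule; the weighting parameter beta and the exact combination formula are assumptions, since the summary does not give the paper's precise rule.

```python
import numpy as np

def select_action(q_values, step_rewards, epsilon, beta=0.5):
    """Blend immediate rewards R(s, a) with predicted Q(s, a)
    (hypothetical weighting; the paper's exact rule may differ).

    q_values     : array of Q-value predictions, one per action
    step_rewards : array of one-step rewards, one per action
    epsilon      : exploration probability
    beta         : weight on the reward term; a larger beta leans on the
                   known immediate reward while early Q estimates are noisy
    """
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_values))  # random exploration
    scores = beta * np.asarray(step_rewards) + (1.0 - beta) * np.asarray(q_values)
    return int(np.argmax(scores))
```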


Summary

INTRODUCTION

The unmanned aerial vehicle (UAV) has attracted wide attention in both military and civilian fields because of its low cost, flexibility, and small size [1], [2]. Combining different methods can exploit the advantages of each algorithm [15]–[17]. Reinforcement learning essentially learns a mapping from state to action and involves no complex search process at decision time, so it is well suited to UAV path planning, which requires real-time decisions. Path planning in a large-scale and dynamic environment poses challenges; among them, the enormous number of states makes the neural network learn slowly and converge with difficulty. We propose a deep reinforcement learning approach to solve the UAV path planning problem in complex and dynamic environments.
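One mechanism the abstract proposes for learning efficiency in this large state space is adaptive experience replay over two sample memory pools keyed to the frequency of failure (see the ADAPTIVE SAMPLING MECHANISM section below). The following is one plausible reading as a minimal sketch; the pool split, the moving-average failure estimate, and the sampling rule are all assumptions rather than the paper's exact design.

```python
import random
from collections import deque

class AdaptiveReplay:
    """Two replay pools sampled in proportion to a running failure rate
    (illustrative sketch; the concrete mechanism is an assumption)."""

    def __init__(self, capacity: int = 50_000):
        self.success_pool = deque(maxlen=capacity)  # transitions from successful episodes
        self.failure_pool = deque(maxlen=capacity)  # transitions from failed episodes
        self.failure_rate = 0.5  # running estimate of how often episodes fail

    def store(self, transition, failed: bool):
        (self.failure_pool if failed else self.success_pool).append(transition)
        # Exponential moving average keeps the failure-frequency estimate current.
        self.failure_rate = 0.99 * self.failure_rate + 0.01 * float(failed)

    def sample(self, batch_size: int):
        # Replay more failure transitions when the agent fails often, so hard
        # cases are revisited; may return fewer than batch_size while the
        # pools are still filling.
        n_fail = min(int(batch_size * self.failure_rate), len(self.failure_pool))
        n_succ = min(batch_size - n_fail, len(self.success_pool))
        return (random.sample(self.failure_pool, n_fail)
                + random.sample(self.success_pool, n_succ))
```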

CONSTRUCTION OF THE ALGORITHM
REWARD DESIGN
IMPROVED ACTION SELECTION STRATEGY
ADAPTIVE SAMPLING MECHANISM
COMPARISON OF ALGORITHM PERFORMANCE IN A STATIC SCENARIO
Findings
CONCLUSION
