Abstract

Recently, unmanned aerial vehicles (UAVs) have shown promising results for autonomous sensing. UAVs have been deployed for applications that include surveillance, mapping, tracking, and search operations. Finding an efficient path between a source and a goal is a critical issue that has been the focus of recent research. Many path-planning algorithms are used to find an efficient path for a UAV to navigate from a source to a goal while avoiding obstacles. Despite the extensive literature and numerous research proposals for path planning, dynamic obstacle avoidance has not been addressed with machine learning. When obstacles are dynamic, i.e., they can change their position over time, the constraints of the path-planning problem become more challenging, which in turn adds a layer of complexity to the path-planning algorithm. To address this challenge, a Q-learning algorithm is proposed in this work to facilitate efficient path planning for UAVs with both static and dynamic obstacle avoidance. We introduce a Shortest Distance Prioritization policy in the learning process, which marginally reduces the distance the UAV has to travel to reach the goal. Further, the proposed Q-learning algorithm adopts a grid-graph-based method to solve the path-planning problem and learns to maximize the reward based on the agent's behavior in the environment. The performance of the proposed approach is compared with state-of-the-art path-planning approaches such as A*, Dijkstra, and SARSA in terms of learning time and path length, and the results show improved performance over these baselines. Further, the effect of an increasing number of obstacles on the performance of the proposed approach is evaluated.
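To make the approach concrete, the sketch below shows tabular Q-learning on a grid world of the kind the abstract describes. The grid size, obstacle positions, reward values, and hyperparameters are all illustrative assumptions, not values from the paper, and the tie-breaking rule in `choose_action` is only our reading of the Shortest Distance Prioritization policy: among equally valued actions, prefer the move that brings the agent closer to the goal.

```python
import random

# Assumed environment: a 10x10 grid, a fixed goal, and hypothetical
# static obstacle cells. Reward shaping (step cost, collision penalty,
# goal reward) is illustrative, not taken from the paper.
GRID = 10
GOAL = (9, 9)
OBSTACLES = {(3, 3), (3, 4), (6, 2), (7, 7)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1          # assumed hyperparameters

# Q-table over (cell, action) pairs, initialized to zero.
Q = {((r, c), a): 0.0
     for r in range(GRID) for c in range(GRID)
     for a in range(len(ACTIONS))}

def step(state, action):
    """Apply an action; moves into walls or obstacles leave the UAV in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < GRID and 0 <= nc < GRID) or (nr, nc) in OBSTACLES:
        return state, -10.0, False             # collision penalty
    if (nr, nc) == GOAL:
        return (nr, nc), 100.0, True           # goal reward
    return (nr, nc), -1.0, False               # step cost encourages short paths

def manhattan(s):
    return abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1])

def choose_action(state):
    """Epsilon-greedy with a shortest-distance tie-break (our assumption)."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    best = max(Q[(state, a)] for a in range(len(ACTIONS)))
    ties = [a for a in range(len(ACTIONS)) if Q[(state, a)] == best]
    # Among equally valued actions, prefer the candidate cell nearest the goal.
    return min(ties, key=lambda a: manhattan((state[0] + ACTIONS[a][0],
                                              state[1] + ACTIONS[a][1])))

for episode in range(2000):
    state = (0, 0)
    for _ in range(400):                       # cap episode length
        a = choose_action(state)
        nxt, reward, done = step(state, a)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + GAMMA * max(Q[(nxt, b)] for b in range(len(ACTIONS)))
        Q[(state, a)] += ALPHA * (target - Q[(state, a)])
        state = nxt
        if done:
            break
```

This sketch covers static obstacles only; dynamic obstacles, as considered in the paper, would amount to updating the occupied cells between steps so that `step` checks obstacle positions at the current time rather than a fixed set.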
