Abstract

Wilderness Search and Rescue (WiSAR) requires searching large regions, often in rugged, remote areas, for missing people or animals. Because of the size of these regions and the limited mobility of ground vehicles, WiSAR missions are frequently carried out with the help of Unmanned Aerial Vehicles (UAVs). However, autonomous execution of WiSAR remains an unsolved challenge. In this paper, we take advantage of Deep Reinforcement Learning (DRL) to develop an autonomous WiSAR controller for UAVs, improving a UAV agent's ability to explore a partially observable environment in search of a victim trapped in the wild. The proposed approach breaks this difficult problem into four sub-tasks: decomposition of the environment into small, tractable maps; region selection; target search; and region exploration. A quad-tree decomposition is applied offline to split the environment map into smaller, tractable maps. Then, an efficient cost function is repeatedly computed to determine the best target region to search in each iteration of the process. Recurrent-DDQN and A2C algorithms are trained to learn policies for the target search and region exploration tasks, respectively. We tested our approach against two baselines: a hard-coded policy that navigates the map in a zigzag fashion, and a variant that uses the same sub-tasks but selects a random action at each time step instead of following the DRL policies. The results demonstrate that our approach navigates 25 randomly generated environments and finds the missing victim 46% faster than the baselines.
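The abstract names the pipeline's stages without giving their details. As a rough illustration only, the sketch below shows one plausible realisation of the offline quad-tree decomposition and the per-iteration region selection. The grid map representation, the size threshold MAX_REGION_SIZE, and the distance-based region_cost are all assumptions introduced for this sketch; the paper's actual cost function and stopping criterion are not specified here.

```python
import numpy as np

# Assumed side length at which a sub-region counts as "tractable";
# the paper's real stopping criterion is not given in the abstract.
MAX_REGION_SIZE = 16

def quadtree_decompose(env_map, x=0, y=0, size=None):
    """Recursively split a square environment map into sub-regions
    no larger than MAX_REGION_SIZE, returned as (x, y, size) tuples."""
    if size is None:
        size = env_map.shape[0]
    if size <= MAX_REGION_SIZE:
        return [(x, y, size)]
    half = size // 2
    regions = []
    for dy in (0, half):
        for dx in (0, half):
            regions.extend(quadtree_decompose(env_map, x + dx, y + dy, half))
    return regions

def region_cost(region, uav_pos):
    """Illustrative cost: Euclidean distance from the UAV to the
    region centre. A stand-in for the paper's unspecified cost."""
    x, y, size = region
    centre = np.array([x + size / 2.0, y + size / 2.0])
    return float(np.linalg.norm(centre - np.asarray(uav_pos)))

# Usage: decompose a 64x64 map offline, then greedily pick the
# cheapest region as the next search target for the UAV.
env_map = np.zeros((64, 64))
regions = quadtree_decompose(env_map)
best = min(regions, key=lambda r: region_cost(r, (0.0, 0.0)))
print(f"{len(regions)} regions; next target region: {best}")
```

In the full system described by the abstract, the selected region would then be handed to the learned policies: the Recurrent-DDQN policy for target search and the A2C policy for region exploration.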
