Deep reinforcement learning based trajectory optimization for magnetometer-mounted UAV to landmine detection

Ahmed Barnawi, Krishan Kumar, Neeraj Kumar, Bander Alzahrani, Ishan Budhiraja, Amal Almansour

doi:10.1016/j.comcom.2022.09.002

Abstract

Unmanned aerial vehicles (UAVs) have emerged as a viable choice for data collection and landmine (LM) detection. In this paper, LMs buried under dirt or sand are detected using a UAV-mounted magnetometer. The UAV is deployed along a planned route and gathers data when the magnetometer receives a signal from an LM. Our goal is to minimize the total energy consumption of the UAV-magnetometer-LM system over a complete round of data collection. To do this, we formulate energy minimization as a constrained combinatorial optimization problem that jointly selects time slots and plans the UAV’s visitation sequence for identifying the LMs. This problem is NP-hard, which makes it difficult to solve optimally. To tackle this challenge, we use a deep reinforcement learning (DRL) approach based on the deep deterministic policy gradient (DDPG) scheme, which improves convergence speed and eliminates redundant computations. Furthermore, to improve real-time detection, we propose the proximal online policy technique (POPT). Numerical results demonstrate that the proposed scheme consumes 37.14%, 31.25%, and 21.42% less energy than synthetic aperture radar (SAR), convolutional neural network (CNN), and double deep recurrent Q-network (DDRQN) approaches, respectively.

Full Text