Smart environmental monitoring has gained prominence, and target localization is of utmost importance within it. Employing UAVs for localization tasks is appealing owing to their low cost, light weight, and high maneuverability. However, UAVs lack decision-making autonomy when faced with uncertain situations. Reinforcement learning (RL) can introduce such intelligence to UAVs, enabling them to learn to act based on the presented situation. Existing works focus on UAV trajectory optimization, navigation, and target tracking. These methods are application-specific and cannot be adapted to localization tasks since they require prior knowledge of the target. Moreover, current RL-based autonomous target localization systems are lacking because (1) they must keep track of all visited locations and their corresponding readings, (2) they require retraining when encountering new environments, and (3) they are not scalable, since the agent's movement is limited to low speeds and to specific environments. Therefore, this work proposes a data-driven UAV target localization system based on Q-learning, which employs a tabular method to learn the optimal policy. A deep Q-network (DQN) is then introduced to enhance the RL model and alleviate the curse of dimensionality. The proposed models enable smart decision-making, in which the sensory information gathered by the UAV is exploited to produce the best action. Moreover, the UAV movement is modeled based on motion physics, with actions corresponding to linear velocities and heading angles. The proposed approach is compared with several benchmarks, and the results indicate that more efficient, scalable, and adaptable localization is achieved, irrespective of the environment or source characteristics and without retraining.
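
To make the tabular formulation concrete, the sketch below implements a standard Q-learning loop for grid-based source seeking. The grid size, eight-direction action set, synthetic signal model, and reward shaping are illustrative assumptions for this sketch, not the paper's exact design.

```python
# Minimal tabular Q-learning sketch for source localization on a grid.
# The discretization, reward shaping, and hyperparameters are assumed.
import numpy as np

rng = np.random.default_rng(0)

GRID = 20                      # assumed 20x20 discretized search area
N_ACTIONS = 8                  # assumed 8 heading directions
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = np.zeros((GRID, GRID, N_ACTIONS))

def step(state, action):
    """Hypothetical environment step: move one cell along the chosen
    heading and reward the increase in a synthetic signal that peaks
    at the (assumed) source location."""
    moves = [(1, 0), (1, 1), (0, 1), (-1, 1),
             (-1, 0), (-1, -1), (0, -1), (1, -1)]
    dx, dy = moves[action]
    x = int(np.clip(state[0] + dx, 0, GRID - 1))
    y = int(np.clip(state[1] + dy, 0, GRID - 1))
    src = (15, 5)  # assumed source position for this toy signal
    signal = lambda p: -np.hypot(p[0] - src[0], p[1] - src[1])
    reward = signal((x, y)) - signal(state)
    done = (x, y) == src
    return (x, y), reward, done

for episode in range(500):
    state = (rng.integers(GRID), rng.integers(GRID))
    for _ in range(200):
        # epsilon-greedy action selection
        if rng.random() < EPS:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state[0], state[1]]))
        nxt, reward, done = step(state, action)
        # standard Q-learning temporal-difference update
        td_target = reward + GAMMA * np.max(Q[nxt[0], nxt[1]]) * (not done)
        Q[state[0], state[1], action] += ALPHA * (
            td_target - Q[state[0], state[1], action])
        state = nxt
        if done:
            break
```

In the full system the reward would derive from the UAV's actual sensor readings rather than this synthetic distance-based signal.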
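When the state space grows (e.g., continuous UAV positions plus sensor readings), the Q-table becomes intractable, which is the curse of dimensionality the DQN addresses. Below is a minimal, assumed DQN update in PyTorch; the network architecture, state dimension, and hyperparameters are placeholders, not the paper's configuration.

```python
# Sketch: a small Q-network replaces the Q-table, mapping a continuous
# state (assumed here: x, y, sensor reading) to one Q-value per action.
import torch
import torch.nn as nn

N_ACTIONS = 16  # e.g., assumed velocity-heading pairs

class QNet(nn.Module):
    def __init__(self, state_dim=3, n_actions=N_ACTIONS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s):
        return self.net(s)

q, q_target = QNet(), QNet()
q_target.load_state_dict(q.state_dict())
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
gamma = 0.95

def dqn_update(batch):
    """One gradient step on the Bellman error for a replay batch of
    (state, action, reward, next_state, done) tensors."""
    s, a, r, s2, done = batch
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * q_target(s2).max(dim=1).values * (1 - done)
    loss = nn.functional.mse_loss(q_sa, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Periodically copying `q` into `q_target` and feeding `dqn_update` from an experience-replay buffer stabilizes training, per standard DQN practice.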
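The abstract states that actions map to linear velocities and heading angles under motion physics. One plausible reading is simple planar kinematics, sketched below; the time step and the discrete velocity-heading action pairs are assumptions.

```python
# Sketch of a planar kinematic UAV motion model: each action selects a
# linear velocity v and heading angle psi, and the position is propagated
# one control interval. DT and the action set are assumed values.
import math

DT = 0.5  # assumed control interval in seconds

def propagate(x, y, v, psi):
    """Advance the UAV one time step with planar kinematics:
    x' = x + v*cos(psi)*DT,  y' = y + v*sin(psi)*DT."""
    return x + v * math.cos(psi) * DT, y + v * math.sin(psi) * DT

# Hypothetical discrete action set: (velocity, heading) pairs.
actions = [(v, k * math.pi / 4) for v in (2.0, 5.0) for k in range(8)]

x, y = 0.0, 0.0
for v, psi in actions[:3]:
    x, y = propagate(x, y, v, psi)
    print(f"v={v} m/s, psi={psi:.2f} rad -> ({x:.2f}, {y:.2f})")
```

Under this modeling, each RL action directly selects a (v, psi) pair rather than a grid move, which is what lets the agent operate at realistic speeds.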