Abstract

Target localization refers to identifying a target's location based on sensory readings gathered by sensing agents (e.g., robots or UAVs) surveying an area of interest. Existing solutions either estimate the target location through fusion and analysis of the collected sensory data, or rely on pre-defined and data-driven survey paths. However, the adaptability of such methods remains an issue, as increasing the complexity and dynamicity of the environment requires further re-modeling and supervision. To obtain efficient and adaptable localization agents, this work proposes several Multi-Agent Deep Reinforcement Learning (MDRL) models to tackle the target localization problem in multi-agent systems. Reinforcement Learning (RL) provides an efficient Artificial Intelligence (AI) paradigm for training intelligent agents that can learn in diverse, complex environments. In this work, an actor–critic structure is used with Convolutional Neural Networks (CNNs), which are optimized using Proximal Policy Optimization (PPO). Agents' observations are modeled as 2D heatmaps capturing the locations and sensor readings of all agents. Cooperation among agents is induced using a team-based reward, which incentivizes agents to cooperate in localizing the target and managing their resources. Scalability with the number of agents is ensured through a Centralized Learning for Decentralized Execution approach, while scalability with the observation size is achieved through image downsampling and Gaussian filters. The efficiency of the proposed models is validated and benchmarked against existing target localization methods through experiments on single- and multi-agent systems, for tasks pertaining to radioactive target localization.
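The abstract describes observations as 2D heatmaps of agent positions and sensor readings, compressed via Gaussian filtering and downsampling. The sketch below illustrates one plausible way to construct such an observation; the grid size, kernel width, and pooling factor are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2D Gaussian kernel (sums to 1)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def build_heatmap(agent_positions, readings, grid=64, out=16, sigma=1.0):
    """Rasterize agents' sensor readings onto a grid, smooth with a
    Gaussian filter, then average-pool down to an `out` x `out` image.
    Hypothetical observation pipeline; details differ from the paper's."""
    hm = np.zeros((grid, grid))
    for (x, y), r in zip(agent_positions, readings):
        hm[y, x] += r  # deposit each reading at the agent's cell

    # Gaussian smoothing via direct 2D convolution with zero padding
    k = gaussian_kernel(5, sigma)
    padded = np.pad(hm, 2)
    smooth = np.zeros_like(hm)
    for i in range(grid):
        for j in range(grid):
            smooth[i, j] = np.sum(padded[i:i + 5, j:j + 5] * k)

    # Downsample by average pooling (reduces the CNN input size)
    f = grid // out
    return smooth.reshape(out, f, out, f).mean(axis=(1, 3))

obs = build_heatmap([(32, 32), (10, 50)], [5.0, 2.0])
print(obs.shape)  # (16, 16)
```

Smoothing before pooling spreads each point reading over neighboring cells, so the coarse observation still reflects readings that would otherwise vanish into a single downsampled pixel.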
