Abstract

Multi-agent remote sensing for finding survivors or surveying points of interest in GPS-denied, partially observable environments remains a challenging problem. This paper presents a framework for multi-agent target-finding that combines online POMDP-based planning with Deep Reinforcement Learning (DRL)-based control. The framework treats planning and control as two separate problems: the planning problem is defined as a decentralised multi-agent graph search and is solved with a modern online POMDP solver, while the control problem is defined as a local continuous-environment exploration task and is solved with modern Deep Reinforcement Learning techniques. The proposed framework combines the solutions to both problems, and testing shows that it enables multiple agents to find a target within large simulated test environments containing unknown obstacles and obstructions. By adjusting the individual model definitions, the approach can also be extended or adapted to a range of time-sensitive remote-sensing problems, from searching for multiple survivors during a disaster to surveying points of interest in a hazardous environment.
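
As a rough illustration of this two-layer decomposition, the Python sketch below shows a high-level planner choosing graph nodes while a learned local policy handles motion. All names, the greedy waypoint rule, and the placeholder policy are illustrative assumptions, not the paper's implementation:

    import random

    class GraphPlanner:
        """High-level decentralised planner (stand-in for the paper's online
        POMDP solver): picks the next graph node for an agent to visit."""
        def next_waypoint(self, node, belief, graph):
            # Greedy placeholder: move to the neighbouring node with the
            # highest believed probability of containing the target.
            return max(graph[node], key=lambda n: belief.get(n, 0.0))

    class LocalPolicy:
        """Low-level controller (stand-in for the trained DRL policy): maps
        a local observation to a motion command."""
        def act(self, observation):
            return random.choice(["forward", "left", "right"])  # placeholder

    # Toy usage: a three-node search graph and a belief over target location.
    graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
    belief = {"A": 0.1, "B": 0.3, "C": 0.6}
    planner, policy = GraphPlanner(), LocalPolicy()
    waypoint = planner.next_waypoint("A", belief, graph)  # -> "C"
    command = policy.act({"goal": waypoint})              # placeholder action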

Highlights

  • In recent years, the use of Unmanned Aerial Vehicles (UAVs) has been broadly explored for a range of applications in both consumer and industrial operating environments

  • This paper focuses on the problem of target-finding using multiple UAV agents in partially observable environments. The presented solution can be expanded and adapted to a number of remote-sensing problem spaces, from searching for survivors in environments such as disaster zones, buildings, cave systems, and open or forested areas, to surveying potentially hazardous or difficult-to-reach points of interest

  • This paper presents a multi-agent target-finding framework, built on the Robot Operating System 2 (ROS2) platform [26], to search partially observable occupancy-map-style environments using multiple simulated UAV agents (a minimal node sketch follows this list)
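
Since the framework runs on ROS2, each simulated agent can be pictured as a node exchanging map and pose messages. The following minimal rclpy sketch is a hedged illustration only; the topic names and message types are assumptions rather than the paper's actual interfaces:

    import rclpy
    from rclpy.node import Node
    from nav_msgs.msg import OccupancyGrid
    from geometry_msgs.msg import PoseStamped

    class AgentNode(Node):
        """One simulated UAV agent: listens for map updates and publishes
        the waypoints chosen by its high-level planner."""
        def __init__(self):
            super().__init__('uav_agent')
            # Subscribe to the occupancy map built during exploration.
            self.map_sub = self.create_subscription(
                OccupancyGrid, 'map', self.on_map, 10)
            # Publish waypoints chosen by the high-level planner.
            self.goal_pub = self.create_publisher(PoseStamped, 'goal_pose', 10)

        def on_map(self, msg: OccupancyGrid):
            # Planning and control logic would run here; we only log receipt.
            self.get_logger().info(
                f'map update: {msg.info.width}x{msg.info.height}')

    def main():
        rclpy.init()
        rclpy.spin(AgentNode())
        rclpy.shutdown()

    if __name__ == '__main__':
        main()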

Introduction

In recent years, the use of Unmanned Aerial Vehicles (UAVs) has been broadly explored for a range of applications in both consumer and industrial operating environments. However, the application of such techniques to controlling the exploration of multiple UAV agents in partially observable and hazardous environments remains largely unexplored. This paper focuses on the problem of target-finding using multiple UAV agents in partially observable environments. The solution presented here can be expanded and adapted to a number of remote-sensing problem spaces, from searching for survivors in environments such as disaster zones, buildings, cave systems, and open or forested areas, to surveying potentially hazardous or difficult-to-reach points of interest. Secondary contributions include the modelling of the multi-agent planning problem for use with the TAPIR POMDP software package and of the local control model as an OpenAI Gym environment that uses occupancy-grid-style maps.
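
To make this second contribution concrete, an occupancy-grid local-control environment of this kind would typically implement the classic OpenAI Gym interface. The skeleton below is a hedged sketch; the grid size, discrete action set, and reward shaping are assumptions, not the paper's model definition:

    import gym
    import numpy as np
    from gym import spaces

    class LocalExplorationEnv(gym.Env):
        """Local exploration task: an agent moves over an occupancy grid
        (0 = free, 1 = obstacle) trying to reach a goal cell."""

        def __init__(self, size=32):
            self.size = size
            # Observation: the agent's local occupancy-grid map.
            self.observation_space = spaces.Box(0.0, 1.0, (size, size),
                                                np.float32)
            # Actions: up, down, left, right.
            self.action_space = spaces.Discrete(4)

        def reset(self):
            self.grid = np.zeros((self.size, self.size), dtype=np.float32)
            self.pos = np.array([0, 0])
            self.goal = np.array([self.size - 1, self.size - 1])
            return self.grid.copy()

        def step(self, action):
            moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
            self.pos = np.clip(self.pos + moves[action], 0, self.size - 1)
            done = bool((self.pos == self.goal).all())
            reward = 1.0 if done else -0.01  # step penalty, goal bonus
            return self.grid.copy(), reward, done, {}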

Background
Problem Definition
Framework Definition
Environment Definition
Decentralised Multi-Agent Planner
Local Control Policy
OpenAI Gym Environment Definition
Defining The Simulated Agent
Training and Using the Policy
Software Architecture
Experimental Results
Testing the Local Controller
Testing the Full Framework
Conclusions