Abstract

Underwater passive acoustic source detection and tracking is important for various marine applications, including marine mammal monitoring and naval surveillance. Performance in these applications depends on the placement and operation of sensing assets, such as autonomous underwater vehicles. Conventionally, these decisions have been made by human operators aided by acoustic propagation modelling tools, situational and environmental data, and experience. However, this process is time-consuming and computationally expensive. We consider a ‘toy problem’ of a single autonomous vehicle (agent) searching for a stationary low-frequency source within a reinforcement learning (RL) architecture. We initially choose the observation space to be the agent’s current position. The agent is allowed to explore the environment with a limited action space, taking equal-distance steps in one of $n$ directions. Rewards are received for positive detections of the source. Using OpenAI’s PPO algorithm, the median episode reward in the RL environment developed increases by approximately 20 points when the agent is given a history of its previous moves and signal-to-noise ratios, compared with the simple positional state. The future expansion of the RL framework is discussed in terms of the observation and action spaces, the reward function, and the RL architecture.
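The toy environment described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' implementation: the 2-D plane, the spherical-spreading SNR model, the source position, the detection threshold, and all numeric values are assumptions chosen only to show the structure (position-only observation, $n$ equal-distance actions, reward on detection).

```python
import math
import random

class SourceSearchEnv:
    """Minimal sketch of the toy RL environment in the abstract.

    Assumptions (not from the paper): a 2-D plane, an SNR that falls
    off with spherical spreading loss, and a fixed detection threshold.
    """

    def __init__(self, n_directions=8, step_size=1.0, source=(20.0, 15.0),
                 detect_threshold_db=10.0, max_steps=200, seed=0):
        self.n = n_directions          # size of the discrete action space
        self.step_size = step_size     # equal-distance step length
        self.source = source           # stationary source position (assumed)
        self.threshold = detect_threshold_db
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Simple state: the observation is the agent's current position.
        self.pos = [0.0, 0.0]
        self.t = 0
        return tuple(self.pos)

    def _snr_db(self):
        # Hypothetical propagation model: source level minus spherical
        # spreading loss (20 log10 r) minus an ambient noise level.
        r = max(math.dist(self.pos, self.source), 1e-3)
        return 60.0 - 20.0 * math.log10(r) - 30.0

    def step(self, action):
        # Take an equal-distance step in one of n evenly spaced directions.
        theta = 2.0 * math.pi * action / self.n
        self.pos[0] += self.step_size * math.cos(theta)
        self.pos[1] += self.step_size * math.sin(theta)
        self.t += 1
        # Reward only for a positive detection of the source.
        reward = 1.0 if self._snr_db() >= self.threshold else 0.0
        done = self.t >= self.max_steps
        return tuple(self.pos), reward, done
```

The richer observation the abstract compares against would simply stack a history of past actions and SNR values onto the returned position; the environment dynamics stay the same.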
