Abstract

We introduce a framework that applies deep reinforcement learning (RL) to active sonar employment: an RL agent is trained to select the waveform parameters that maximize the probability of single-target detection. We first simulate raw sonar returns from targets and clutter in reverberation and noise using a physics-based sonar simulation model, the Sonar Simulation Toolkit (SST). We then process the resulting signatures into network inputs using an in-house signal and information processing model of an archetypal antisubmarine warfare (ASW) processing chain. We demonstrate that, when presented with sonar returns from a reverberation-limited or noise-limited environment, the trained RL agent appropriately selects between continuous wave (CW) and hyperbolic frequency modulated (HFM) waveforms depending on target trajectory, and selects an optimal trade-off between bandwidth and pulse length under a constant time-bandwidth product.
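
To make the constant time-bandwidth constraint concrete, the following minimal sketch enumerates candidate bandwidth and pulse-length pairs that share one time-bandwidth product. It is an illustration only, not the authors' implementation: the function name `enumerate_tradeoff`, the nominal sound speed of 1500 m/s, and the textbook approximation of matched-filter range resolution as c / (2B) are all assumptions that do not appear in the abstract.

```python
# Illustrative sketch only: names and constants here are hypothetical,
# not taken from the paper or from the Sonar Simulation Toolkit (SST).

SOUND_SPEED_M_S = 1500.0  # assumed nominal sound speed in seawater


def enumerate_tradeoff(time_bandwidth_product, bandwidths_hz):
    """List (bandwidth, pulse length) pairs sharing one time-bandwidth product.

    For each candidate bandwidth B, the pulse length is T = TB / B.
    Range resolution uses the common approximation c / (2 * B).
    """
    options = []
    for bandwidth_hz in bandwidths_hz:
        if bandwidth_hz <= 0:
            raise ValueError("bandwidth must be positive")
        pulse_length_s = time_bandwidth_product / bandwidth_hz
        range_resolution_m = SOUND_SPEED_M_S / (2.0 * bandwidth_hz)
        options.append(
            {
                "bandwidth_hz": bandwidth_hz,
                "pulse_length_s": pulse_length_s,
                "range_resolution_m": range_resolution_m,
            }
        )
    return options


if __name__ == "__main__":
    # Example: a time-bandwidth product of 100 and three candidate bandwidths.
    for option in enumerate_tradeoff(100.0, [50.0, 100.0, 200.0]):
        print(option)
```

Under these assumptions, a wider bandwidth gives finer range resolution but a shorter pulse, while a narrower bandwidth gives a longer pulse at coarser resolution. An agent choosing among such pairs would be making the kind of bandwidth versus pulse-length trade-off the abstract describes.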

