Precise localization of unmanned underwater vehicles (UUVs) in multi-beam forward-looking sonar (MFLS) images is a key technology in underwater robotic exploration. However, the large appearance changes and weak features of targets in MFLS images pose a serious challenge to target detection. This paper proposes an efficient UUV detection model, UUVDNet, that combines an enhanced training scheme with a convolutional block attention mechanism. The proposed training scheme is composed of dual-path gradient information optimization and adaptive loss adjustment, which improve the network's detection capability and its adaptability to the underwater environment. The convolutional block attention module (CBAM) is employed to extract channel-weighted and spatially weighted feature maps, providing a mechanism for adaptive feature refinement. We construct a comprehensive sonar detection dataset from field experiments. Compared with existing state-of-the-art detection algorithms, the proposed UUVDNet outperforms Faster R-CNN, Cascade R-CNN, CenterNet, SSD, YOLOv8n, DPFIN, and MBSNN in terms of mAP50–95 by 39.5%, 35.7%, 8.4%, 14.0%, 2.5%, 13.6%, and 21.8%, respectively. Furthermore, our inference model is the most compact among all compared detection models, with a size of 5.1M. Finally, ablation experiments demonstrate the effectiveness of the proposed components.
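To make the channel and spatial weighting concrete, the following is a minimal NumPy sketch of a generic CBAM-style block: channel attention from pooled descriptors passed through a shared two-layer MLP, followed by spatial attention from a convolution over channel-pooled maps. It is an illustration only and not the UUVDNet implementation. The function names, the reduction ratio `r`, the 7×7 kernel size, and the random weights are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C//r, C); w2: (C, C//r). Returns weights of shape (C,)."""
    avg = x.mean(axis=(1, 2))          # global average pooling per channel
    mx = x.max(axis=(1, 2))            # global max pooling per channel
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # shared MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns weights of shape (H, W)."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):                 # naive "same" convolution, for clarity
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x, kernel)[None, :, :]

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
y = cbam(x, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, every weight lies strictly between 0 and 1, so the block rescales the feature map without changing its shape. In a trained detector the weights would be learned rather than sampled.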