Sonar image object detection is essential in underwater rescue and resource exploration. Although many convolutional neural network (CNN)-based object detection algorithms have achieved great success on natural images, underwater sonar images pose considerable challenges to accurate detection, such as seabed reverberation noise, a low proportion of foreground object pixels, and poor imaging resolution. To address these problems, we propose a novel sonar image object detector called the multilevel feature fusion network (MLFFNet). The detector consists of a multiscale convolution module (MS-Conv), a multilevel feature extraction module (ML-FEM), a multilevel feature fusion module (ML-FFM), a neighborhood channel attention mechanism (N-CAM), a multiscale feature pyramid module (MS-FPN), and a feature association module (FA). First, the MS-Conv extracts feature information at different scales in the object region. Second, the ML-FEM and ML-FFM obtain local detail and global context features. Third, the N-CAM and MS-FPN obtain the foreground objects' semantic and position features while suppressing background noise interference. Finally, the FA module enhances the category and feature correlation among different objects. Extensive experiments are conducted on a real-scene sonar image dataset. The experimental results demonstrate that MLFFNet performs better than other state-of-the-art object detection methods. Code and dataset are publicly available at <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/darkseid-arch/SonarMLFFNet</uri>.
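The abstract does not specify how the N-CAM weights channels, so the following is only a minimal, dependency-free sketch of a generic channel-attention step of the kind such modules typically build on: each channel of a feature map is squeezed to its global mean, passed through a sigmoid excitation, and used to rescale that channel (so channels dominated by near-zero background responses are attenuated). The function name `channel_attention` and the `gamma` scaling parameter are hypothetical, not from the paper.

```python
import math

def channel_attention(features, gamma=1.0):
    """Rescale each channel by a sigmoid of its global mean activation.

    features: list of C channels, each an H x W nested list of floats.
    gamma: hypothetical temperature controlling the excitation sharpness.
    Returns a new list of channels with the same shape.
    """
    weights = []
    for ch in features:
        # Squeeze: global average pool over the spatial dimensions.
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        # Excitation: sigmoid gate in (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-gamma * mean)))
    # Scale: reweight every spatial position of each channel.
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(features, weights)]

# A strongly responding (foreground-like) channel keeps most of its energy,
# while a flat near-zero (background-like) channel is damped toward half weight.
fmap = [[[2.0, 2.0], [2.0, 2.0]],   # high-activation channel
        [[0.0, 0.0], [0.0, 0.0]]]   # flat background channel
out = channel_attention(fmap)
```

This is the squeeze-and-excite pattern in its simplest form; the paper's N-CAM presumably adds neighborhood interactions between channels, which are omitted here.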