Despite advancements in underwater object detection (UOD) from optical underwater images in recent years, the task still poses significant challenges due to the chaotic underwater environment, as well as the substantial variations in scale and contour of objects. Existing deep learning-based schemes generally overlook the enhancement and refinement between multi-scale features of densely distributed underwater objects, leading to inaccurate localization and classification predictions with excessive information redundancy. To tackle the above issues, this article presents a novel feature boosting and differential pyramid network (FBDPN) for precise and efficient UOD. The salient properties of our paper are: (1) a heuristic feature pyramid network (FPN)-inspired architecture is constructed, which employs a convolutional neural network (CNN)-Transformer hybrid strategy to simultaneously facilitate the learning of multi-scale features and the capture of long-distance dependencies among pixels. (2) A neighborhood-scale feature boosting module (NSFBM) is developed to enhance contextual information between features of neighborhood scales. (3) A cross-scale feature differential module (CSFDM) is designed further to achieve effective information redundancy between features of different scales. Extensive experiments are conducted to reveal that our proposed FBDPN can outperform other state-of-the-art methods in both UOD performance and computational complexity. In addition, sufficient ablation studies are also performed to demonstrate the effectiveness of each component in our FBDPN. The source code is available at https://github.com/jixun-dmu/FBDPN.
Read full abstract