Underwater fish image segmentation is a crucial technique in marine fish monitoring. However, typical underwater fish images often suffer from color distortion, low contrast, and blurriness, primarily due to the complex and dynamic nature of the marine environment. To improve the accuracy of underwater fish image segmentation, this paper introduces a neural network model that combines an attention mechanism with a feature pyramid module. After the backbone network extracts convolutional features from the input image, these features pass through the enhanced feature pyramid module, where they are iteratively processed by multiple weighted branches. Unlike conventional methods, the proposed multi-scale feature extraction module not only improves the extraction of high-level semantic features but also optimizes the weighting of low-level shape features through the synergistic interaction of its branches, while preserving the inherent properties of the image. This architecture significantly boosts segmentation accuracy and offers a new solution for fish image segmentation tasks. To further enhance the model's robustness, the Mix-up and CutMix data augmentation techniques were employed. The model was validated on the Fish4Knowledge dataset, and the experimental results show that it achieves a Mean Intersection over Union (MIoU) of 95.1%, with improvements of 1.3%, 1.5%, and 1.7% in MIoU, Mean Pixel Accuracy (PA), and F1 score, respectively, over traditional segmentation methods. In addition, a real fish image dataset captured in deep-sea environments was constructed to verify the practical applicability of the proposed algorithm.
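The abstract does not detail how the weighted branches of the enhanced feature pyramid are combined. Purely as a rough illustration of the general idea of learnable, weighted multi-branch fusion (not the authors' implementation), the sketch below uses a fast-normalized-fusion-style weighting; the module name `WeightedFusion`, the channel count, and the number of branches are hypothetical.

```python
# Rough illustration only: one common way to fuse same-scale feature maps from
# several branches with learnable, normalized weights. Names and shapes are
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse same-shaped feature maps from several branches with learned weights."""
    def __init__(self, num_branches: int, channels: int):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_branches))
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, features):
        # Non-negative branch weights normalized to sum to 1.
        w = F.relu(self.weights)
        w = w / (w.sum() + 1e-4)
        fused = sum(wi * f for wi, f in zip(w, features))
        return self.conv(fused)

# Example: fuse three branches of 64-channel feature maps at the same scale.
branches = [torch.randn(1, 64, 32, 32) for _ in range(3)]
out = WeightedFusion(num_branches=3, channels=64)(branches)
```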
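Mix-up and CutMix are named explicitly as the augmentation strategies, but the paper's hyperparameters are not given here. The following is a minimal NumPy sketch of the two standard techniques applied to image/mask pairs for segmentation; the Beta parameter `alpha` and the array-shape conventions are illustrative assumptions.

```python
# Minimal sketch of Mix-up and CutMix for image/mask pairs (HxWxC images,
# HxW masks). `alpha` values are illustrative, not from the paper.
import numpy as np

def mixup(img_a, img_b, mask_a, mask_b, alpha=0.4):
    """Blend two image/mask pairs with a Beta-sampled coefficient (Mix-up)."""
    lam = np.random.beta(alpha, alpha)
    img = lam * img_a + (1.0 - lam) * img_b
    # For segmentation, the labels become a soft, weighted blend of the masks.
    mask = lam * mask_a + (1.0 - lam) * mask_b
    return img, mask, lam

def cutmix(img_a, img_b, mask_a, mask_b, alpha=1.0):
    """Paste a random rectangular patch of sample B into sample A (CutMix)."""
    lam = np.random.beta(alpha, alpha)
    h, w = img_a.shape[:2]
    cut_h, cut_w = int(h * np.sqrt(1.0 - lam)), int(w * np.sqrt(1.0 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    img, mask = img_a.copy(), mask_a.copy()
    img[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    mask[y1:y2, x1:x2] = mask_b[y1:y2, x1:x2]
    return img, mask
```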