To address the difficulty of accurately identifying and tracking the abnormal behaviors of individual fish, and the poor adaptability of existing approaches in aquaculture, this paper proposes a Mask2former model combined with a bidirectional routing attention mechanism (BiFormer) and a multiscale dilated attention (MSDA) module for recognizing and segmenting abnormal fish behavior. To compensate for the lack of publicly available datasets on abnormal fish behavior, we created the “FISH_segmentation_2023” dataset, which includes four types of fish behavior. First, the BiFormer attention mechanism is introduced so that the model can better capture critical temporal and spatial information in image sequences, enhancing feature representation. Second, after the pixel decoder processes the feature maps, the MSDA module performs multiscale fusion on these features. The fused features are then passed to the transformer decoder, further enhancing the model’s ability to recognize abnormal fish behaviors. Finally, to further improve model performance and address class imbalance in the dataset, we designed a composite loss function combining focal loss and dice loss (FD loss). This loss function balances the influence of easy and difficult‐to‐classify samples while optimizing segmentation performance, thereby improving the model’s recognition accuracy and mean intersection over union (mIoU). Experimental results show that the BiFormer multiscale dilated attention FD loss (BMF)‐Mask2former model achieves mIoU, accuracy, and recall values of 92.33%, 95.63%, and 94.82%, respectively, on the self‐built FISH_segmentation_2023 dataset, representing improvements of 6.10%, 4.50%, and 5.09%, respectively, over the Mask2former model.
The study demonstrates that the proposed model can accurately capture both local and contextual features of abnormal fish behaviors through multiscale fusion, resulting in high‐quality segmentation outcomes.
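
To illustrate how a composite focal and dice loss of the kind described above can balance easy and hard samples, the following is a minimal sketch and not the paper's implementation. It assumes binary masks given as flat lists of per-pixel probabilities. The function names, the equal weights `w_focal` and `w_dice`, and the values `gamma=2.0`, `alpha=0.25`, and `smooth=1.0` are illustrative assumptions, since the abstract does not state them.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flattened pixel probabilities.

    gamma and alpha are assumed defaults, not values from the paper.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p         # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights pixels that are already well classified
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft dice loss: 1 minus the smoothed overlap ratio of the two masks."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def fd_loss(probs, targets, w_focal=1.0, w_dice=1.0):
    """Weighted sum of focal and dice loss (weights are hypothetical)."""
    return w_focal * focal_loss(probs, targets) + w_dice * dice_loss(probs, targets)


if __name__ == "__main__":
    target_mask = [1, 0, 1, 0]
    print(fd_loss([0.9, 0.1, 0.8, 0.2], target_mask))  # confident, mostly correct
    print(fd_loss([0.4, 0.6, 0.3, 0.7], target_mask))  # mostly wrong, larger loss
```

In this sketch the focal term concentrates the gradient on hard pixels, while the dice term scores region overlap directly, which is why such a combination is commonly used when foreground classes are rare.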