Accurate and rapid segmentation of key parts of frozen tuna, along with precise pose estimation, is crucial for automated processing. However, challenges such as size differences and indistinct features of tuna parts, as well as the complexity of determining fish poses in multi-fish scenarios, hinder this process. To address these issues, this paper introduces TunaVision, a vision model based on YOLOv8 designed for automated tuna processing. TunaVision enhances instance segmentation through YOLOv8m-FusionSeg, improving the segmentation of small and complex targets by increasing channel depth and optimizing feature fusion. Additionally, the YOLOv8s-RSF model improves feature extraction speed and accuracy, ensuring each fish is correctly identified and localized before segmentation and pose estimation. Furthermore, TunaVision employs a vector-based approach for pose estimation, combining detection and segmentation results to determine fish posture and orientation. Experiments show that YOLOv8m-FusionSeg achieves an mAP@0.5 of 93.3% and YOLOv8s-RSF achieves an mAP@0.5 of 96.1%, while the vector-based pose estimation attains a mean absolute error (MAE) of 1.81 degrees in angle estimation, significantly outperforming the compared methods. These findings highlight TunaVision's effectiveness in segmenting, detecting, and estimating poses of frozen tuna, offering valuable insights for the development of automated processing systems.
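The abstract does not detail the vector-based pose step, so the sketch below shows one plausible reading under stated assumptions: the fish's in-plane orientation is taken as the principal axis of its binary segmentation mask, and the angle error is scored with a wrap-around MAE. The function names are hypothetical illustrations, not the paper's code; in the full system the head/tail direction of the axis could be disambiguated with the detection results, as the abstract implies.

```python
import numpy as np

def mask_orientation_deg(mask: np.ndarray) -> float:
    """Estimate the in-plane orientation of a binary mask (assumed fish
    silhouette) as the angle of its principal axis, in degrees in [0, 180).

    Hypothetical illustration of a vector-based pose step; not the paper's code.
    """
    ys, xs = np.nonzero(mask)                  # pixel coordinates inside the mask
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)                    # center the silhouette point cloud
    cov = pts.T @ pts / len(pts)               # 2x2 covariance of the silhouette
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    major = eigvecs[:, np.argmax(eigvals)]     # principal (long-axis) vector
    return float(np.degrees(np.arctan2(major[1], major[0])) % 180.0)

def angle_mae(pred_deg, true_deg) -> float:
    """Mean absolute angular error in degrees, wrapping at 180 degrees so
    that e.g. 179 vs. 1 counts as 2 degrees (axis-type orientations)."""
    d = np.abs(np.asarray(pred_deg, float) - np.asarray(true_deg, float)) % 180.0
    return float(np.mean(np.minimum(d, 180.0 - d)))

# Toy usage: a thin diagonal bar whose long axis lies at 45 degrees.
m = np.zeros((100, 100), dtype=np.uint8)
for i in range(10, 90):
    m[i, i] = 1
print(mask_orientation_deg(m))   # ~45.0
print(angle_mae([44.0, 179.0], [45.0, 1.0]))   # 1.5
```

PCA on the mask gives only an undirected axis, which is why a detection cue (e.g., the located head region) would be needed to resolve the 180-degree ambiguity into a full orientation.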