This study addresses the problem of detecting occluded apples in complex, unstructured orchard environments and proposes an apple detection and segmentation model based on an improved YOLOv8n, named SGW-YOLOv8n. The model combines the SPD-Conv convolution module, the GAM global attention mechanism, and the Wise-IoU loss function to improve the accuracy and robustness of apple detection and segmentation. The SPD-Conv module preserves fine-grained image features by converting spatial information into channel information, which makes it particularly suitable for small-target detection. The GAM global attention mechanism improves the recognition of occluded targets by strengthening feature representation in both the channel and spatial dimensions. The Wise-IoU loss function further optimises the regression accuracy of the bounding box. The model is trained and validated on a dataset prepared in advance. The results show that SGW-YOLOv8n significantly outperforms the original YOLOv8n in both target detection and instance segmentation, especially in occlusion scenes. The model raises the detection mAP to 75.9% and the segmentation mAP to 75.7% while maintaining a processing speed of 44.37 FPS, which meets real-time requirements and provides effective technical support for fruit detection and segmentation by harvesting robots in complex, unstructured environments.
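The conversion of spatial information into channel information that SPD-Conv relies on can be illustrated with a minimal NumPy sketch of a space-to-depth rearrangement. This is an illustrative assumption, not the authors' implementation: the function name `space_to_depth`, the scale factor of 2, the slice ordering, and the toy feature map are all hypothetical, and the non-strided convolution that normally follows this step is omitted.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) feature map into (C * scale**2, H / scale, W / scale).

    Hypothetical helper for illustration: no pixel is discarded, so fine-grained
    detail is moved into the channel dimension instead of being lost to
    strided downsampling.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Sample every `scale`-th pixel at each (row, column) offset,
    # then stack the resulting sub-maps along the channel axis.
    slices = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    return np.concatenate(slices, axis=0)


# Toy feature map (assumed sizes): 2 channels, 4 x 4 spatial resolution.
feature_map = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
packed = space_to_depth(feature_map)
print(feature_map.shape, "->", packed.shape)
```

The spatial resolution is halved while the channel count is multiplied by four, so the total number of values is unchanged. This is the property that makes the approach attractive for small or partially occluded targets.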