Abstract

Tomato yield estimation depends heavily on accurate detection of fruit quantity and size, and object detection and semantic segmentation of fruits are effective approaches to fruit counting and size measurement. To address the challenges of detecting and segmenting tomato fruits in complex environments, such as class imbalance among samples, small targets, and susceptibility to occlusion at varying stages of ripeness, this study proposed a foreground-foreground class balance method and an improved YOLOv8s network, NVW-YOLOv8s, for detecting and segmenting tomato fruits simultaneously. The foreground-foreground class balance method first extracts, pixel by pixel, fruit samples from classes with few instances and then composites them onto original images that contain few samples of those classes, thereby increasing the number of samples in the under-represented categories. In the NVW-YOLOv8s network, a C2f-N module built on the normalization-based attention module (NAM) was designed for residual feature learning, strengthening the network's ability to extract and fuse feature information of tomato fruits in complex environments. In addition, a varifocal loss (VFL) was introduced as the classification loss to address the imbalance between positive and negative samples, and a Wise-IoU (WIoU) based regression loss was adopted to handle small fruit targets and occlusion. YOLOv8s models trained on the augmented balanced dataset improved detection and segmentation mAP@0.5 by 4.8 % and 5.4 %, respectively, compared with the model trained on the augmented original training dataset. On the augmented balanced dataset, the proposed NVW-YOLOv8s network achieved a detection mAP@0.5 of 91.4 % with an F1-score of 85.4 %, and a segmentation mAP@0.5 of 90.7 % with an F1-score of 84.8 %, surpassing the baseline YOLOv8s by 4.3 % and 5.5 % for detection and by 4.1 % and 5.0 % for segmentation, respectively. Concurrent detection and segmentation ran at 60.2 FPS. The proposed method therefore meets the accuracy and real-time requirements of intelligent yield estimation for horticultural crops.
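As an illustration of the foreground-foreground class balance idea described above, the sketch below is not the authors' released code; the function and parameter names are hypothetical. It performs a mask-guided copy-paste of one minority-class fruit onto a target image; in practice the corresponding box and mask annotations would also be added to the target image's labels.

    import numpy as np

    def paste_minority_fruit(target_img, source_img, source_mask, offset):
        # target_img, source_img: H x W x 3 uint8 arrays
        # source_mask: H x W boolean array marking the fruit's pixels
        # offset: (dy, dx) translation applied to the pasted fruit
        out = target_img.copy()
        ys, xs = np.nonzero(source_mask)
        dy, dx = offset
        ys_t, xs_t = ys + dy, xs + dx
        # Keep only the pixels that land inside the target image.
        valid = (ys_t >= 0) & (ys_t < out.shape[0]) & \
                (xs_t >= 0) & (xs_t < out.shape[1])
        out[ys_t[valid], xs_t[valid]] = source_img[ys[valid], xs[valid]]
        return out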

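The C2f-N module is described as building on NAM. Below is a minimal sketch of a NAM-style channel attention block, assuming the commonly published formulation in which the BatchNorm scale factors are normalized and used as channel weights; how this block is wired into the C2f structure is not specified in the abstract.

    import torch
    import torch.nn as nn

    class NAMChannelAttention(nn.Module):
        # Channel attention that reuses BatchNorm scale factors as weights.
        def __init__(self, channels: int):
            super().__init__()
            self.bn = nn.BatchNorm2d(channels)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            residual = x
            x = self.bn(x)
            # Normalize the absolute BN scale factors so they sum to 1,
            # then use them to reweight the channels.
            w = self.bn.weight.abs() / self.bn.weight.abs().sum()
            x = x * w.view(1, -1, 1, 1)
            return torch.sigmoid(x) * residual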