Recognizing ripe tomatoes is a crucial aspect of tomato picking. To ensure accurate inspection results, You Only Look Once version 9 (YOLOv9) has been explored as a fruit detection algorithm. To address the difficulty of identifying tomatoes and the low accuracy of small-object detection in complex environments, we propose a ripe tomato recognition algorithm based on an enhanced YOLOv9-C model. After collecting tomato data, we applied Mosaic data augmentation, which improved model robustness and enriched the experimental data. We modified the feature extraction and down-sampling modules by integrating HGBlock and SPD-ADown modules into the YOLOv9 model. In horizontal and vertical experimental comparisons, the resulting model achieved precision and recall of 97.2% and 92.3%, respectively. Compared with the original model, the module-integrated model improved accuracy and recall by 1.3% and 1.1%, respectively, and reduced inference time by 1 ms. Its inference time was 14.7 ms, which is 16 ms faster than the RetinaNet model, and its mAP@0.5 reached up to 98%, which is 9.6% higher than RetinaNet. The higher speed and accuracy make it better suited to practical applications. Overall, this model provides a reliable technique for recognizing ripe tomatoes during the picking process.
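
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision and recall are conventionally computed from detection counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the experimental data or the evaluation code behind the figures above.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from detection counts.

    tp: detections matched to a ground-truth ripe tomato
    fp: detections with no matching ground-truth tomato
    fn: ground-truth ripe tomatoes that were not detected
    """
    # Precision: share of detections that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of ground-truth objects that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.900, recall=0.750
```

In object detection, whether a prediction counts as a true positive usually depends on an intersection-over-union threshold against the ground-truth box; the 0.5 in mAP@0.5 denotes that threshold.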