The demand for harvesting green citrus has increased with the rapid growth of the green citrus industry. However, real-time detection of green citrus is challenging because the fruit is visually similar to its background. To address this issue, a new model called EDGC-YOLO (Efficient Detection of Green Citrus based on YOLO) has been developed to detect green citrus effectively and accurately. Specifically, the proposed model uses a data-driven approach that takes the highest-confidence aspect ratio of unobstructed citrus fruits to infer anchor boxes for obscured citrus, ensuring a high degree of overlap with the empirically annotated bounding boxes. Furthermore, by analyzing the distribution range of the citrus annotation categories in the dataset and incorporating the highest-confidence region of the normal distribution of aspect ratios, the initial cluster centers of the BGMM (Bayesian Gaussian Mixture Model) are adjusted dynamically. This strategy yields inferred anchor boxes that are best suited to detecting green citrus. The improved anchor boxes provide a more accurate and effective solution for centroid positioning of obscured green citrus, significantly enhancing the detection accuracy and lightweight efficiency of the entire network. To further improve detection accuracy and reduce model size, this study adopted a Refined-EfficientNetV2 network as the backbone, enhanced with a Convolutional Block Attention Module (CBAM) for better feature extraction. This combination allows the model to capture relevant channel information and spatial features, strengthening feature extraction from green citrus images. Experimental results demonstrate the model's superior performance: the number of parameters is reduced to 4.52 million and the computational demand to 7.8 GFLOPs (64.4 % and 49.4 % of the original model, respectively), while the model size decreases to 9.4 MB (65.3 % of the original). In addition, the optimized model improves Precision (P), Recall (R), and mean Average Precision (mAP) by 0.5 %, 1.6 %, and 3.0 %, respectively, over the original model. The proposed model thus achieves higher detection accuracy with a smaller model size, providing theoretical support for the harvesting decisions of green citrus harvesting robots.
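As an illustration of the anchor-inference idea, the sketch below fits a Bayesian Gaussian Mixture Model to the normalized widths and heights of annotated boxes and uses the component means as anchors. This is a minimal sketch, not the paper's implementation: the label directory path, the anchor count of 9, and the use of scikit-learn's BayesianGaussianMixture with a simple mean prior (standing in for the paper's dynamic initialization of cluster centers, which the abstract does not specify) are all assumptions.

```python
# Minimal sketch of BGMM-based anchor inference on YOLO-format labels.
# Assumptions (not from the paper): the "labels/train" path, 9 anchors,
# and scikit-learn's BayesianGaussianMixture with a data-mean prior as a
# stand-in for the paper's dynamically adjusted initial cluster centers.
from pathlib import Path

import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def load_box_sizes(label_dir: str) -> np.ndarray:
    """Collect normalized (width, height) pairs from YOLO .txt label files."""
    sizes = []
    for txt in Path(label_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            parts = line.split()
            if len(parts) == 5:  # class, x_center, y_center, w, h
                sizes.append([float(parts[3]), float(parts[4])])
    return np.asarray(sizes)


def infer_anchors(sizes: np.ndarray, n_anchors: int = 9) -> np.ndarray:
    """Fit a Bayesian GMM to box sizes and use the component means as anchors."""
    bgmm = BayesianGaussianMixture(
        n_components=n_anchors,
        covariance_type="full",
        weight_concentration_prior=1e-2,  # lets unneeded components shrink away
        mean_prior=sizes.mean(axis=0),    # biases centers toward the data mean
        random_state=0,
    ).fit(sizes)
    anchors = bgmm.means_
    return anchors[np.argsort(anchors.prod(axis=1))]  # sort by anchor area


if __name__ == "__main__":
    wh = load_box_sizes("labels/train")  # hypothetical label directory
    print(infer_anchors(wh))
```

In such a setup, the sorted anchors would replace the detector's default anchors in its configuration before training.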
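The CBAM used to enhance the backbone is a published module (Woo et al., 2018); a minimal PyTorch rendering is sketched below for reference. The reduction ratio of 16 and the 7x7 spatial kernel are the module's common defaults, not values confirmed by this abstract, and where the module is inserted within Refined-EfficientNetV2 is likewise an assumption.

```python
# Minimal CBAM sketch in PyTorch: channel attention followed by spatial
# attention. The reduction ratio (16) and kernel size (7) are common
# defaults from the CBAM paper, not values stated in this abstract.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))  # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))   # global max pooling branch
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)   # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)    # channel-wise max map
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Channel attention then spatial attention, applied to a feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))
```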