Existing strawberry ripeness detection algorithms suffer from low detection accuracy and high error rates. To address these problems, we propose MS-YOLOv5, an improved method based on YOLOv5. First, the feature extraction network of MS-YOLOv5 is reconfigured by replacing standard convolution with depth hybrid deformable convolution (Ms-MDconv). Second, a double cooperative attention mechanism (Bc-attention) is constructed and embedded in the CSP2 module to improve feature representation in complex environments. Finally, the Neck of MS-YOLOv5 is enhanced by replacing the CSP2 module with a fast-weighted fusion of cross-scale feature pyramid networks (FW-FPN), which not only integrates multi-scale target features but also significantly reduces the number of parameters. Tested on the strawberry ripeness dataset, the method reached an mAP of 0.956 and 76 FPS with a model size of 7.44M. The mAP and FPS are 8.4 and 1.3 percentage points higher than those of the baseline network, respectively, and the model size is 6.28M smaller. The method outperforms mainstream algorithms in both detection speed and accuracy. It can accurately identify strawberry ripeness in complex environments and could provide technical support for automated picking robots.
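The abstract does not define the fusion rule used inside FW-FPN. As a rough illustration only, the sketch below assumes a fast normalized fusion of the kind used in BiFPN, where each input feature map receives a learnable non-negative weight and the weights are normalized by their sum rather than by a softmax. The function name `fast_weighted_fusion`, the flat-list representation of a feature map, and the `eps` default are hypothetical choices made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def fast_weighted_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors with normalized non-negative weights.

    Assumed rule (BiFPN-style, not confirmed by the abstract):
        out = sum_i(relu(w_i) * f_i) / (eps + sum_j relu(w_j))
    """
    if len(features) != len(weights):
        raise ValueError("one weight per feature map is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same length")

    # ReLU keeps every weight non-negative, so the normalization stays stable.
    clipped = [max(0.0, w) for w in weights]
    total = eps + sum(clipped)

    # Weighted sum at each position, divided by the weight total.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    # Two toy "feature maps" that are already resized to the same scale.
    p_shallow = [1.0, 2.0]
    p_deep = [3.0, 4.0]
    print(fast_weighted_fusion([p_shallow, p_deep], [1.0, 1.0]))
```

Dividing by the weight sum avoids the exponentials of a softmax, which is the usual motivation for calling this kind of fusion "fast". In a real network the inputs would be tensors resized to a common resolution and the weights would be trained parameters.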