This study proposes a novel lightweight grape detection method. First, the method uses UniFormer as its backbone network, which captures long-range dependencies and improves feature extraction. Second, a Bi-directional Path Aggregation Network (BiPANet) fuses low-resolution feature maps, which carry strong semantic information, with high-resolution feature maps, which carry fine detail. BiPANet is built by introducing a novel cross-layer feature enhancement strategy into the Path Aggregation Network, so that it fuses more feature information while significantly reducing the number of parameters and the computational complexity. Third, to improve the localization accuracy of the optimal bounding boxes, a Reposition Non-Maximum Suppression (R-NMS) algorithm is proposed for post-processing. R-NMS repositions each optimal bounding box using the position information of the bounding boxes that surround it. Experiments on the WGISD dataset show that our method achieves 87.7% mAP, 88.6% precision, 78.3% recall, an F1 score of 83.1%, and 46 FPS. Compared with YOLOX, YOLOv4, YOLOv3, Faster R-CNN, SSD, and RetinaNet, the mAP of our method is higher by 0.8%, 1.7%, 3.5%, 21.4%, 2.5%, and 13.3%, respectively, and its speed is higher by 2, 8, 2, 26, 0, and 10 FPS, respectively. Similar results are obtained on a second grape dataset. These results indicate that our method outperforms other established detection methods on grape detection tasks.
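The abstract does not state how R-NMS computes the repositioned box, so the following Python sketch only illustrates the general idea of refining the highest-scoring box with its neighbours. It assumes a score-weighted average of the coordinates of all boxes that overlap the optimal box, which may differ from the authors' actual rule. The function names `iou` and `reposition_nms` and the parameter `iou_thresh` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reposition_nms(
    boxes: List[Box], scores: List[float], iou_thresh: float = 0.5
) -> List[Tuple[Box, float]]:
    """Greedy NMS in which each kept box is repositioned.

    Assumption (not taken from the paper): the repositioned box is the
    score-weighted mean of the optimal box and the boxes it suppresses.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[Tuple[Box, float]] = []
    while order:
        best = order[0]
        # The optimal box plus every remaining box that overlaps it enough.
        cluster = [best] + [
            i for i in order[1:] if iou(boxes[best], boxes[i]) >= iou_thresh
        ]
        total = sum(scores[i] for i in cluster)
        if total > 0:
            merged = tuple(
                sum(scores[i] * boxes[i][k] for i in cluster) / total
                for k in range(4)
            )
        else:
            merged = boxes[best]  # fall back to the unmodified optimal box
        kept.append((merged, scores[best]))
        # Standard NMS step: drop the whole cluster from further consideration.
        members = set(cluster)
        order = [i for i in order if i not in members]
    return kept
```

Standard NMS would keep the optimal box unchanged and discard its neighbours. In this sketch the neighbours are still discarded, but their coordinates first contribute to the final position of the kept box, and the kept box retains the score of the original optimal box.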