Abstract

With the emergence of the new-generation vision architecture Vmamba and the growing demand for higher agricultural yield and efficiency, we propose an efficient, high-accuracy object detection network based on Vmamba for automated pear picking, aiming to address the low efficiency of current Transformer architectures. The proposed network, named SRSMamba, employs a Reward and Punishment Mechanism (RPM) to focus on important information while minimizing interference from redundant information. It uses 3D Selective Scan (SS3D) to extend the scanning dimensions and integrates global information across channel dimensions, which enhances the model's robustness in complex agricultural environments and lets it adapt to the complex features found in pear orchards and farmlands. Additionally, a Stacked Feature Pyramid Network (SFPN) is introduced to enrich semantic information during the feature fusion stage, in particular improving the detection of small targets. Experimental results show that SRSMamba has 21.1 M parameters and 50.4 GFLOPs, and achieves an mAP of 72.0%, an mAP50 of 94.8%, an mAP75 of 68.1%, and 26.9 FPS. Compared with other state-of-the-art (SOTA) object detection methods, it achieves a good trade-off between model efficiency and detection accuracy.
