Abstract
Accurately detecting small vehicle targets that occupy only a few pixels in remote sensing images (RSIs) remains highly challenging. General vehicle detection networks typically rely on complex deep neural networks to learn the features of small targets in RSIs. However, while this approach improves accuracy, it also increases model size. To address this issue, this paper presents the Restructuring Architecture – You Only Look Once (RA-YOLO) network, which is based on network reconfiguration. Integrating the CA mechanism and adding a small target detection layer (STDL) enhances the extraction of shallow features of small targets. In addition, the feature fusion part in the middle of the original PAFPN layer is restructured as a Lite-Asymptotic Feature Pyramid Network (L-AFPN) module, which further strengthens shallow feature extraction for small targets and improves multi-scale feature fusion. The results indicate that RA-YOLO improves the mean Average Precision (mAP) by 10.1%, 2% and 7.5% on the remote sensing datasets DIOR, NWPU VHR-10 and DOTAv1.5-small vehicle, respectively. Moreover, the number of model parameters is reduced from 7.02 M to 5.22 M. The proposed model achieves a good trade-off between accuracy and model size compared with state-of-the-art models.