Abstract

In object detection from high-resolution remote sensing images (HRRSIs), existing YOLOv5 methods encounter several challenges, including dense object arrangements, small object sizes, and complex backgrounds. To tackle these challenges, we propose a novel approach called C3TB-YOLOv5, which combines traditional YOLOv5 with the Transformer model to detect objects in HRRSIs. Unlike conventional YOLOv5 methods, which primarily capture local information from remote sensing scenes, our C3TB-YOLOv5 method incorporates global information through a new C3TB module. This module, based on the Transformer multi-head attention mechanism (AM), consists of two branches that extract local and global information from feature maps. By integrating these branches and establishing long-range relationships, our method successfully detects densely arranged small objects in HRRSIs. Furthermore, to improve the accuracy of tiny object detection, a novel detection head is developed to make use of the otherwise unused C3 module, thereby preventing the loss of fine-grained textures and positional features. In addition, we integrate an enhanced SimAM, namely Sim-GMP, into the model to adjust the focus across varying regions, effectively distinguishing the features of objects of interest from complex backgrounds. Finally, to address the problem of sample imbalance in remote sensing object detection, the recent Wise-IoU v3 loss function is employed to improve the accuracy of anchor box predictions for objects. To maintain a high object detection speed, the most critical C3 modules are replaced with the proposed C3TB module, striking a good balance between object detection accuracy and a lightweight model. Extensive experiments conducted on two remote sensing datasets, NWPU VHR-10 and VisDrone 2019, demonstrate that our method achieves better object detection performance than state-of-the-art methods.
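The attention component mentioned above builds on SimAM, a parameter-free attention mechanism. As a rough illustration only, the sketch below applies the original SimAM weighting (as we understand it from the SimAM literature) to a single feature-map channel in plain Python. It does not implement the Sim-GMP variant, whose details are not given in this abstract. The function names `sigmoid` and `simam_channel` and the default `lam` value are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function; z is always >= 0.5 here, so exp(-z) cannot overflow."""
    return 1.0 / (1.0 + math.exp(-z))


def simam_channel(channel, lam=1e-4):
    """Reweight one 2D channel (a list of rows) with original SimAM attention.

    Hypothetical helper for illustration; this is NOT the paper's Sim-GMP.
    Each activation is scaled by sigmoid(d / (4 * (var + lam)) + 0.5), where
    d is its squared deviation from the channel mean and var is the sum of
    squared deviations divided by (number of activations - 1).
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1
    if n < 1:
        raise ValueError("channel needs at least two activations")

    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    var = sum(d for row in sq_dev for d in row) / n

    # Activations that stand out from the channel mean get a larger weight.
    return [
        [v * sigmoid(d / (4.0 * (var + lam)) + 0.5) for v, d in zip(row, dev_row)]
        for row, dev_row in zip(channel, sq_dev)
    ]


if __name__ == "__main__":
    spike = [[1.0, 1.0, 1.0], [1.0, 8.0, 1.0], [1.0, 1.0, 1.0]]
    for row in simam_channel(spike):
        print([round(v, 3) for v in row])
```

In this toy channel, the single activation that differs strongly from its surroundings keeps a larger fraction of its value than the near-uniform background, which is the behaviour that lets such a module emphasize objects against background clutter without adding learnable parameters.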

