Abstract

In view of the existence of remote sensing images with large variations in spatial resolution, small and dense objects, and the inability to determine the direction of motion, all these components make object detection from remote sensing images very challenging. In this paper, we propose a single-stage detection network based on YOLOv5. This method introduces the MS Transformer module at the end of the feature extraction network of the original network to enhance the feature extraction capability of the network model and integrates the Convolutional Block Attention Model (CBAM) to find the attention area in dense scenes. In addition, the YOLOv5 target detection network is improved by incorporating a rotation angle approach from the a priori frame design and the bounding box regression formulation to make it suitable for rotating frame-based detection scenarios. Finally, the weighted combination of the two difficult sample mining methods is used to improve the focal loss function, so as to improve the detection accuracy. The average accuracy of the test results of the improved algorithm on the DOTA data set is 77.01%, which is higher than the previous detection algorithm. Compared with the average detection accuracy of YOLOv5, the average detection accuracy is improved by 8.83%. The experimental results show that the algorithm has higher detection accuracy than other algorithms in remote sensing scenes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call