Abstract
Despite significant progress in object detection, target detection in remote sensing images remains challenging owing to complex backgrounds, large differences in target sizes, and the uneven distribution of rotated objects. In this study, we jointly consider model accuracy, inference speed, and the detection of objects at arbitrary angles. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network; the backbone performs the initial feature extraction from the input image and balances training accuracy against inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone. The FPN and PANet modules integrate feature maps from different layers, combine context information at multiple scales, accumulate multiple features, and strengthen feature extraction. To maximize the detection accuracy for objects of all sizes, we use four target detection scales at the network output, which enhances feature extraction for small remote sensing targets. To handle objects at arbitrary angles, we improve the classification loss function using the circular smooth label technique, which turns angle regression into a classification problem and increases the detection accuracy for objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show that the proposed method performs better than previous methods.
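The circular smooth label idea mentioned above, treating the angle as a class and spreading the label over neighboring angle bins with wrap-around, can be illustrated with a minimal sketch. The abstract does not give the bin count, window shape, or window radius, so the values below (180 one-degree bins, a truncated Gaussian window, radius 6) are assumptions for illustration only, and the function name `circular_smooth_label` is hypothetical rather than taken from the authors' code.

```python
import numpy as np


def circular_smooth_label(angle_deg, num_bins=180, radius=6.0):
    """Encode an angle as a smoothed, periodic classification target.

    Assumptions (not from the source): one-degree bins over [0, 180),
    a Gaussian window with standard deviation equal to `radius`,
    truncated to zero beyond `radius` bins from the target bin.
    """
    bins = np.arange(num_bins)
    center = int(round(angle_deg)) % num_bins

    # Circular distance: bin 0 and bin num_bins - 1 are neighbors.
    diff = np.abs(bins - center)
    dist = np.minimum(diff, num_bins - diff).astype(float)

    # Gaussian window centered on the target bin, zero outside the radius.
    label = np.exp(-(dist ** 2) / (2.0 * radius ** 2))
    label[dist > radius] = 0.0
    return label


# Usage: an angle near the boundary also activates bins on the other side.
label = circular_smooth_label(179.0)
print(label[177:180], label[0:2])
```

Because the distance is computed on a circle, a prediction of 0 degrees for a target of 179 degrees receives almost full credit, which is the property that makes a classification target better behaved at the angular boundary than a raw regression target. Such a vector would typically be paired with a per-bin binary cross-entropy loss, though the exact loss used in the paper is not stated in this text.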
Highlights
Target detection is a basic task in computer vision and helps estimate the category of objects in a scene and mark their locations
The DOTA dataset [57] comprises 2806 aerial images obtained from different sensors and platforms and covers 15 object categories: plane (PL), baseball diamond (BD), bridge (BR), ground track field (GTF), small vehicle (SV), large vehicle (LV), ship (SH), tennis court (TC), basketball court (BC), storage tank (ST), soccer-ball field (SBF), roundabout (RA), harbor (HA), swimming pool (SP), and helicopter (HC)
We focus on the interval between 0.6 and 0.9, where the recall rate is concentrated
Summary
Target detection is a basic task in computer vision that estimates the categories of objects in a scene and marks their locations. Object detection in remote sensing images remains challenging. Research on remote sensing images has crucial applications in the military, disaster control, environmental management, and transportation planning [1,2,3,4], and it has attracted significant attention from researchers in recent years. Object detection in aerial images has become a prevalent topic in computer vision [5,6,7].