Abstract
Object detection in remote sensing images is an important and challenging research topic in computer vision, with wide use in military and civilian applications. Detection models that combine CNNs and Transformers have recently achieved good results, but their poor performance on small objects remains an open problem. This letter proposes a Deformable DETR-based framework for object detection in remote sensing images. First, Multi-Scale Split Attention (MSSA) is designed to extract more detailed feature information through grouping. Next, we propose a Multi-Scale Deformable Prescreening Attention (MSDPA) mechanism in the decoding layer, which performs pre-screening so that the encoder-decoder structure can obtain the attention map more efficiently. Finally, the A-D loss function is applied to the prediction layer, increasing the attention paid to small objects and optimizing the IoU function. Extensive experiments on the DOTA v1.5 and HRRSD datasets show that the reconstructed detection model is better suited to remote sensing objects, especially small ones. On the DOTA dataset, average detection accuracy improves by 4.4% (to 75.6%), and accuracy on small objects rises by 5%.