Abstract

Object detection in remote sensing images with densely distributed objects remains challenging because object scale varies widely and the relative positions of, and correlations among, objects are neglected. To address these issues, a Correlation Learning Detector based on Transformer (CLT-Det) is proposed for detecting dense objects in remote sensing images. A Transformer Attention Module (TAM) is designed to improve the model's ability to represent densely packed objects by learning pixel-wise attention with a Transformer. To alleviate the semantic gap caused by variations in scale, a Feature Refinement Module (FRM) is proposed that improves the multi-scale feature pyramid. A Correlation Transformer Module (CTM) is proposed to extract correlation information and encode the position information of dense objects' features on the classification branch, so that the position information and the correlations among objects are fully utilized. Extensive experiments against several state-of-the-art methods on two challenging remote sensing datasets, DOTA and HRSC2016, demonstrate that the proposed CLT-Det achieves competitive performance.
