Object recognition and localization play a crucial role in aerial imagery and its applications. Aerial images are challenging due to large aspect ratios, arbitrary orientations, variation in scale, and non-uniform, cluttered object distributions. To address these challenges, we propose an efficient and robust model called the Simultaneous Object Localization and Recognition Network (SOLARNet), a fusion network that integrates two sub-networks: the PixelAttentionDetector (PD) and the RotationDetector (RD). PD handles features at different scales and cluttered objects, while RD provides rotation invariance, yielding both horizontal and oriented object detection results. Existing state-of-the-art models lose accuracy when images are adversarially attacked. SOLARNet is not only efficient in terms of accuracy but also robust against Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), and Jacobian-based Saliency Map (JSM) adversarial attacks, a crucial requirement for any mission-critical system. In experiments, SOLARNet achieves 75.00% and 66.40% mAP on the publicly available DOTA dataset, and 88.60% and 81.50% mAP on the DIOR dataset, for the horizontal and oriented object recognition tasks respectively, while maintaining high inference speed. Qualitative and quantitative results reported in this paper substantiate the superiority of SOLARNet over other state-of-the-art methodologies.
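For readers unfamiliar with the attacks named above, the following is a minimal NumPy sketch of the FGSM perturbation on a toy linear classifier (the model, weights, and helper names here are illustrative assumptions, not part of SOLARNet); FGSM shifts each input coordinate by a small step epsilon in the sign direction of the loss gradient, and BIM simply applies this step iteratively with clipping.

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon):
    """Fast Gradient Sign Method: move the input by epsilon in the
    sign direction of the loss gradient, maximizing loss increase
    under an L-infinity budget of epsilon."""
    return x + epsilon * np.sign(grad)

def logistic_loss(w, x, y):
    """Loss of a toy linear classifier: log(1 + exp(-y * w.x))."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def loss_grad_wrt_input(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    return -y * w / (1.0 + np.exp(y * np.dot(w, x)))

# Illustrative weights, input, and label (assumed values for the sketch).
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.2, -0.1, 0.4])
y = 1.0

g = loss_grad_wrt_input(w, x, y)
x_adv = fgsm_perturb(x, g, epsilon=0.1)

# The perturbation stays inside the epsilon ball, yet the loss increases.
assert np.max(np.abs(x_adv - x)) <= 0.1 + 1e-12
assert logistic_loss(w, x_adv, y) > logistic_loss(w, x, y)
```

The same `fgsm_perturb` step, repeated with a per-iteration clip back to the epsilon ball around the original input, gives the Basic Iterative Method.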