Abstract

The detection of arbitrary-oriented and multi-scale objects in satellite optical imagery is an important task in remote sensing and computer vision. Despite significant research efforts, such detection remains largely unsolved because of the diversity of object orientation, scale, aspect ratio, and visual appearance; the dense distribution of objects; and extreme category imbalance. In this paper, we propose an adaptive dynamic refined single-stage transformer detector (ADT-Det) to address these challenges, aiming to achieve high recall and speed. Our detector realizes rotated object detection with RetinaNet as the baseline. First, we propose a feature pyramid transformer (FPT) that enhances feature extraction in the rotated object detection framework through a feature interaction mechanism. This benefits the detection of densely distributed objects that vary in scale, aspect ratio, and visual appearance. Second, we design two special post-processing steps for rotated objects with arbitrary orientations, large aspect ratios, and dense distributions. The output features of the FPT are fed into these post-processing steps. The first step performs a preliminary regression of locations and angle anchors for the refinement step. The refinement step first performs adaptive feature refinement and then produces the final object detection result. Its main component is dynamic feature refinement (DFR), which adaptively adjusts the feature map and reconstructs a new feature map for arbitrary-oriented object detection, alleviating the mismatch between rotated bounding boxes and axis-aligned receptive fields. Third, the focal loss is adopted to deal with the category imbalance problem.
Experiments on two challenging public satellite optical imagery datasets, DOTA and HRSC2016, demonstrate that the proposed ADT-Det detector achieves state-of-the-art detection accuracy (79.95% mAP on DOTA and 93.47% mAP on HRSC2016) while running at 14.6 fps with a 600 × 600 input image size.
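
As a minimal sketch of the loss referred to above for category imbalance, the following assumes it is the focal loss introduced with RetinaNet, in its binary per-anchor form. The function name `focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions taken from the original RetinaNet formulation, not values reported in this paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (object) or 0 (background)
    alpha, gamma: assumed defaults, not taken from this paper
    """
    # Clamp to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positive and negative examples.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a hard one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.10, 1)
print(f"easy: {easy:.6f}, hard: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which makes the role of `gamma` in suppressing the many easy background anchors easy to see.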

Highlights

  • In the past few decades, Earth observation satellites have been monitoring changes in the Earth’s surface, and the amount and resolution of satellite optical images have greatly improved

  • Our detector realizes rotated object detection with RetinaNet as the baseline to achieve the detection of multi-scale objects and densely distributed objects

  • The key idea of dynamic feature refinement (DFR) is to adaptively adjust the feature map and reconstruct a new feature map for arbitrary-oriented object detection to alleviate the mismatches between the rotated bounding box and the axis-aligned receptive fields

Summary

Introduction

In the past few decades, Earth observation satellites have been monitoring changes in the Earth’s surface, and the amount and resolution of satellite optical images have greatly improved. The task of object detection in satellite optical images is to localize objects of interest (such as vehicles, ships, aircraft, buildings, airports, and ports) and identify their categories. This has numerous practical applications in satellite remote sensing and computer vision, including natural disaster warning, Earth surveying and mapping, surveillance, and traffic planning. Much progress in general-purpose horizontal detectors has been achieved through advances in deep convolutional neural networks (DCNNs) and the emergence of large datasets [1]. Unlike natural images that are usually taken from horizontal
