Abstract

Although the orientation and scale properties of objects in remote sensing images have been widely considered in modern deep learning-based object detection methods, the spatial distribution of objects has rarely been investigated. Close-range objects and remote sensing objects differ distinctly in their spatial distribution: the former may exhibit extensive mutual occlusion and overlap, whereas the latter rarely overlap. Current remote sensing object detection algorithms that ignore this difference may unnecessarily apply massive anchor-based proposal box generation and nonmaximum suppression (NMS) operations. In this article, considering the unique spatial distribution of remote sensing objects along with their other spatial properties, we propose a novel, compact, and spatial-oriented object detection framework for remote sensing images. The proposed two-stage convolutional neural network (CNN) framework, which we call the Remote-sensing Spatial Adaptation DETector (RSADet), accounts for the spatial distribution, scale, and orientation/shape variations of the objects in remote sensing images. In the first stage, each object instance is inferred on scale-attention boosted CNN heatmaps to generate candidate bounding boxes, replacing anchor-based proposal box generation and NMS. In the second stage, deformable convolutions are introduced to adapt to the geometric variations of different object instances and to reduce the impact of complex and changeable backgrounds. A new bounding box confidence (IoU score) prediction branch is introduced as a convenient constraint for eliminating unreliable boxes and improving performance. Experiments were conducted on a large single-class remote sensing object detection dataset (the Ningbo Pylon dataset) built as part of this study and on an open-source, extraordinarily large multiclass dataset (the object DetectIon in Optical Remote sensing image (DIOR) dataset). Compared with advanced detectors from both the computer vision and remote sensing communities, the proposed RSADet achieved state-of-the-art performance on both datasets.
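
To make the first-stage idea more concrete, the following is a minimal sketch of how object candidates might be read off a heatmap without anchors or NMS, followed by a filter on a predicted IoU score. It is not RSADet's implementation: the abstract does not specify the inference rule, so the 3x3 local-maximum test, the function names `heatmap_peaks` and `filter_by_predicted_iou`, and the thresholds `0.3` and `0.5` are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def heatmap_peaks(
    heatmap: Sequence[Sequence[float]], threshold: float = 0.3
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for each cell that scores at least `threshold`
    and is not exceeded by any cell in its 3x3 neighbourhood.

    Assumption: a local-maximum test stands in for the paper's instance
    inference step. Tied neighbouring cells are all kept.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the (up to 8) neighbours, clipping at the borders.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


def filter_by_predicted_iou(boxes, iou_scores, min_iou: float = 0.5):
    """Keep boxes whose predicted IoU score is at least `min_iou`.

    Assumption: the IoU branch outputs one score per candidate box.
    """
    return [box for box, s in zip(boxes, iou_scores) if s >= min_iou]


if __name__ == "__main__":
    toy_heatmap = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.9, 0.2, 0.0],
        [0.1, 0.2, 0.1, 0.6],
        [0.0, 0.0, 0.1, 0.2],
    ]
    print(heatmap_peaks(toy_heatmap))
    print(filter_by_predicted_iou(["box_a", "box_b"], [0.8, 0.3]))
```

Because objects that rarely overlap tend to produce well-separated heatmap peaks, a per-cell test like this can yield one candidate per object, which is the intuition behind dropping NMS in this setting.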
