Abstract
Object detection in aerial images is a challenging task. Although many advanced methods based on convolutional neural networks have become popular for natural scenes, progress on aerial images has been slower. Unlike objects in natural scenes, objects in aerial images exhibit arbitrary orientation, dense distribution, and large scale variation, which lead to a series of problems such as feature misalignment, missed detections, and poor detection of objects with large aspect ratios. In this paper, a Refined Oriented Staged Detector (ROSD), which combines a refined horizontal detector with a rotated detector, is proposed to address these problems. In our refined horizontal detector, a multi-orientation Region of Interest (RoI) Align and an Orientation Attention Module (OAM) are adopted to exploit orientation information, yielding orientation-sensitive features in the regression branch and orientation-invariant features in the classification branch. To handle the feature misalignment between horizontal and rotated anchors in the rotated detector, a Deform Inception Module (DIM) is proposed to deal with the geometric deformation caused by location changes. In addition, we propose an aspect ratio guided loss, which consists of a smooth L1 loss and an angular offset penalty loss, to improve detection performance on objects with large aspect ratios. Comparison experiments on two public aerial image datasets (i.e., DOTA and HRSC2016) demonstrate that our method achieves competitive performance.
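To make the idea of an aspect ratio guided loss more concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not give. It assumes boxes are parameterized as (cx, cy, w, h, theta) with theta in radians, and it assumes the angular offset penalty is |sin(theta_pred - theta_target)| scaled by the target's aspect ratio. The function names, the weight `lam`, and the exact form of the penalty are all hypothetical; only the smooth L1 term follows its standard definition.

```python
import math


def smooth_l1(x, beta=1.0):
    """Standard smooth L1 loss for a single residual x."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def aspect_ratio_guided_loss(pred, target, lam=1.0):
    """Hypothetical aspect ratio guided loss for one rotated box.

    pred, target: (cx, cy, w, h, theta), theta in radians.
    Assumed form (not taken from the paper):
        smooth L1 over all five parameters
        + lam * aspect_ratio(target) * |sin(theta_pred - theta_target)|
    """
    # Regression term: smooth L1 summed over the five box parameters.
    reg = sum(smooth_l1(p - t) for p, t in zip(pred, target))

    # Aspect ratio of the target box, always >= 1.
    w, h = target[2], target[3]
    aspect = max(w, h) / min(w, h)

    # Assumed angular offset penalty, weighted more heavily for elongated boxes.
    angle_penalty = abs(math.sin(pred[4] - target[4]))
    return reg + lam * aspect * angle_penalty


if __name__ == "__main__":
    # The same 0.1 rad angle error on a square box and on an elongated box.
    square = aspect_ratio_guided_loss((0, 0, 10, 10, 0.1), (0, 0, 10, 10, 0.0))
    elongated = aspect_ratio_guided_loss((0, 0, 100, 10, 0.1), (0, 0, 100, 10, 0.0))
    print(square, elongated)
```

Under these assumptions, the same angular error costs more on an elongated box than on a square one, which is the intuition behind guiding the loss by aspect ratio: a small rotation error on a long, thin object causes a much larger drop in overlap.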
Highlights
Object detection is an important task in computer vision
Most state-of-the-art detectors rely on the Region-CNN (R-CNN) frameworks, which consist of two parts: a Region Proposal Network (RPN) and an R-CNN head for detection
To maintain a promising performance on object detection, we propose a Refined Oriented Staged Detector (ROSD) which consists of two parts: a refined horizontal detector and an oriented detector
Summary
Object detection is an important task in computer vision. Many high-performance general object detectors based on Deep Convolutional Neural Networks (DCNNs) have been proposed in recent years [1]–[5]. Current object detectors can generally be divided into two types: one-stage detectors and two-stage detectors. Two-stage detectors achieve better detection performance, whereas one-stage detectors offer faster detection. Most state-of-the-art detectors rely on the Region-CNN (R-CNN) framework, which consists of two parts: a Region Proposal Network (RPN) and an R-CNN head for detection. The RPN refines horizontal anchors to generate more accurate Regions of Interest (RoIs), and an RoI pooling operator is adopted to extract features for