Abstract

Objects in remote sensing images are typically densely packed, arbitrarily oriented, and surrounded by complex backgrounds. Considerable effort has been devoted to developing oriented object detection models that accommodate these data characteristics. We argue that an effective detection model hinges on three aspects: feature enhancement, feature decoupling for classification and localization, and an appropriate bounding box regression scheme. In this article, we instantiate these three aspects on top of the classical Faster R-CNN by proposing three novel components. First, we propose a weighted fusion and refinement (WFR) module, which adaptively weighs multi-level features and uses an attention mechanism to refine the fused features. Second, we decouple the region-of-interest (RoI) features for the subsequent classification and localization tasks via a lightweight affine transformation-based feature decoupling (ATFD) module. Third, we propose a post-classification regression (PCR) module for generating the desired quadrilateral bounding boxes. Specifically, PCR predicts the precise vertex location on each side of a predicted horizontal box by learning to (i) classify the discretized regression range of the vertex and (ii) refine the vertex location with an offset. We conduct extensive experiments on the DOTA, DIOR-R, and HRSC2016 datasets to evaluate our method.
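
To make the classify-then-refine idea behind PCR concrete, the following is a minimal decoding sketch, not the authors' implementation. It assumes that each side of the horizontal box is split into equal-width bins, that the classifier emits one score per bin, and that the offset is expressed in units of bin width relative to the bin center. The function names `decode_side` and `decode_quad`, the bin layout, and the offset convention are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def decode_side(start: float, length: float,
                bin_scores: Sequence[float], offset: float) -> float:
    """Decode one vertex coordinate along a side of the horizontal box.

    Assumption: the side is divided into len(bin_scores) equal bins and
    `offset` is measured in bin widths from the center of the chosen bin.
    """
    num_bins = len(bin_scores)
    bin_width = length / num_bins
    # Step (i): pick the bin with the highest classification score.
    best = max(range(num_bins), key=lambda i: bin_scores[i])
    # Step (ii): refine the bin center with the regressed offset.
    pos = start + (best + 0.5 + offset) * bin_width
    # Keep the vertex on the side of the box.
    return min(max(pos, start), start + length)


def decode_quad(hbox: Tuple[float, float, float, float],
                side_scores: Sequence[Sequence[float]],
                side_offsets: Sequence[float]) -> List[Point]:
    """Build a quadrilateral with one vertex per side of (x1, y1, x2, y2).

    Sides are ordered top, right, bottom, left (an assumed convention).
    """
    x1, y1, x2, y2 = hbox
    w, h = x2 - x1, y2 - y1
    top = decode_side(x1, w, side_scores[0], side_offsets[0])
    right = decode_side(y1, h, side_scores[1], side_offsets[1])
    bottom = decode_side(x1, w, side_scores[2], side_offsets[2])
    left = decode_side(y1, h, side_scores[3], side_offsets[3])
    return [(top, y1), (x2, right), (bottom, y2), (x1, left)]


if __name__ == "__main__":
    scores = [
        [0.1, 0.7, 0.1, 0.1],  # top side: bin 1 wins
        [0.1, 0.1, 0.1, 0.7],  # right side: bin 3 wins
        [0.1, 0.1, 0.7, 0.1],  # bottom side: bin 2 wins
        [0.7, 0.1, 0.1, 0.1],  # left side: bin 0 wins
    ]
    offsets = [0.25, -0.5, 0.0, 0.5]
    print(decode_quad((0.0, 0.0, 8.0, 4.0), scores, offsets))
```

Under these assumptions, the classification step narrows each vertex to a coarse interval and the offset only has to cover a range of one bin width, which is one plausible reading of why the scheme is described as post-classification regression.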
