Detecting objects in remote sensing images (RSIs) using oriented bounding boxes (OBBs) is an active but challenging research area, in which the design of the OBB representation is key to accurate detection. In this article, we focus on two issues that hinder the performance of two-stage oriented detectors: 1) the notorious boundary discontinuity problem, which causes sharp loss increases at boundary cases, and 2) the inconsistency in regression schemes between the two stages. We propose a simple and effective bounding box representation inspired by the polar coordinate system and integrate it into both detection stages to circumvent the two issues. The first stage initializes four quadrant points as the starting points of the regression, producing high-quality oriented candidates without any postprocessing. In the second stage, the final localization results are refined using the proposed bounding box representation, which allows the oriented detector to be exploited to its full capability. This consistency between the two stages yields a good trade-off between accuracy and speed. With only flipping augmentation and single-scale training and testing, our approach with ResNet-50-FPN achieves 76.25% mAP on the DOTA dataset at a speed of up to 16.5 frames/s, the best accuracy and the fastest speed among mainstream two-stage oriented detectors. Additional results on the DIOR-R and HRSC2016 datasets also demonstrate the effectiveness and robustness of our method. The source code is publicly available at https://github.com/yanqingyao1994/QPDet.
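To make the boundary discontinuity problem concrete, the following minimal sketch compares two nearly identical boxes whose angle parameters fall on opposite sides of the angular boundary. It assumes a common angle-based parameterization (cx, cy, w, h, theta) with theta in [-90, 90) degrees and a period of 180 degrees. That convention, the helper names, and the example boxes are illustrative assumptions and are not the representation proposed in this article.

```python
import math

def obb_corners(cx, cy, w, h, theta_deg):
    """Return the four corners of an oriented box (angle given in degrees)."""
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    half = [(w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)]
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in half]

def naive_angle_l1(theta_a, theta_b):
    """Plain L1 difference on the raw angle parameter."""
    return abs(theta_a - theta_b)

def periodic_angle_diff(theta_a, theta_b, period=180.0):
    """Smallest angular gap once the assumed 180-degree periodicity is respected."""
    d = abs(theta_a - theta_b) % period
    return min(d, period - d)

def corner_set_distance(box_a, box_b):
    """Mean distance from each corner of box A to its nearest corner of box B."""
    ca, cb = obb_corners(*box_a), obb_corners(*box_b)
    return sum(min(math.dist(p, q) for q in cb) for p in ca) / len(ca)

# Two hypothetical boxes that differ by only 2 degrees in true orientation,
# but whose angle parameters sit on opposite sides of the boundary.
box_a = (0.0, 0.0, 100.0, 20.0, 89.0)
box_b = (0.0, 0.0, 100.0, 20.0, -89.0)

naive = naive_angle_l1(box_a[4], box_b[4])          # large raw-parameter gap
true_gap = periodic_angle_diff(box_a[4], box_b[4])  # small geometric gap
geometry_gap = corner_set_distance(box_a, box_b)    # corners nearly coincide

print(naive, true_gap, round(geometry_gap, 2))
```

Under these assumptions, the raw angle difference is large even though the two boxes almost overlap, which is the kind of loss spike at boundary cases that an alternative box representation aims to avoid.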