Abstract
High-resolution remote sensing image object detection plays an increasingly important role in image processing and interpretation. The application of region-based convolutional neural networks (R-CNN) has greatly improved object detection performance. However, attributes of remote sensing images such as very large image size, similar backgrounds, and imbalanced category distribution make this task more challenging. Previous works have focused on extracting multi-scale features of region proposals, often ignoring the quality of the regions of interest (ROIs). In this work, we propose a patch-based three-stage aggregation network (PTAN) for object detection in high-resolution remote sensing images. It consists of a three-stage cascade structure that sequentially improves the quality of candidate regions by increasing the intersection over union (IoU) threshold stage by stage, and adopts a resampling strategy to obtain sufficient region proposals. We also propose a patch-based strategy and apply it to the framework during training and inference. Ablation experiments and comprehensive evaluations on the public remote sensing image object detection dataset DOTA demonstrate the effectiveness and robustness of the proposed framework, which obtained a mean average precision (mAP) of 0.7958 on the validation set and a highly ranked mAP of 0.7858 on the test set. On another remote sensing image object detection dataset, NWPU VHR-10, the proposed PTAN obtained an mAP of 0.9187, outperforming five other object detectors.
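The abstract does not say how the patch-based strategy divides an image. As a rough illustration only, the sketch below tiles a large image into overlapping fixed-size windows, a common way to handle very large scenes. The function names, the 1024-pixel patch size, and the 512-pixel stride are assumptions made for this example and are not taken from the paper.

```python
def patch_origins(length, patch, stride):
    """Start offsets of windows of size `patch` that cover [0, length)."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        # Add a final window flush with the border so no pixels are skipped.
        starts.append(length - patch)
    return starts


def split_into_patches(width, height, patch=1024, stride=512):
    """Return (x1, y1, x2, y2) windows covering a width x height image.

    The patch and stride defaults are illustrative assumptions.
    """
    return [
        (x, y, min(x + patch, width), min(y + patch, height))
        for y in patch_origins(height, patch, stride)
        for x in patch_origins(width, patch, stride)
    ]


if __name__ == "__main__":
    for window in split_into_patches(2000, 1024):
        print(window)
```

At inference time, detections from each window would be shifted back by the window origin and merged, for example with non-maximum suppression; that step is omitted here.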
Highlights
Object detection in high-resolution remote sensing images is a core issue of image analysis and interpretation, and it mainly includes two tasks: classification and regression [1].
To obtain sufficient positive region proposals at a high intersection over union (IoU) threshold, and to achieve high-precision remote sensing image object detection, we propose an end-to-end framework, namely the Patch-based Three-stage Aggregation Network (PTAN).
In the methodology section, we first elaborate on the architecture of the proposed patch-based three-stage aggregation network (PTAN) in detail, and introduce how the cascade network works during training and inference.
Summary
Object detection in high-resolution remote sensing images is a core issue of image analysis and interpretation, and it mainly includes two tasks: classification and regression [1]. To obtain sufficient positive region proposals at a high IoU threshold, and to achieve high-precision remote sensing image object detection, we propose an end-to-end framework, namely the Patch-based Three-stage Aggregation Network (PTAN). It first adopts the region proposal network (RPN) [10] to generate numerous candidate boxes, and then obtains positive samples at a high IoU threshold through a three-stage aggregation network. These positive samples are sent to the subsequent detectors to achieve accurate object detection.
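To make the role of the stage-wise IoU threshold concrete, the toy sketch below counts how many proposals qualify as positives at each stage, with and without box refinement between stages. It is not the PTAN implementation: the thresholds (0.5, 0.6, 0.7) are assumed values, and `refine_toward` is a hypothetical stand-in for a learned box regressor that simply moves each box halfway toward the ground truth.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed thresholds for illustration; the summary above does not give values.
STAGE_THRESHOLDS = (0.5, 0.6, 0.7)


def refine_toward(box, target, step=0.5):
    """Hypothetical stand-in for a learned regressor: move box toward target."""
    return tuple(b + step * (t - b) for b, t in zip(box, target))


def static_positive_counts(proposals, gt_box, thresholds=STAGE_THRESHOLDS):
    """Positives per threshold when the proposals are never refined."""
    return [sum(iou(p, gt_box) >= t for p in proposals) for t in thresholds]


def cascade_positive_counts(proposals, gt_box, thresholds=STAGE_THRESHOLDS):
    """Positives per stage when boxes are refined before the next stage."""
    counts, current = [], list(proposals)
    for t in thresholds:
        counts.append(sum(iou(p, gt_box) >= t for p in current))
        current = [refine_toward(p, gt_box) for p in current]
    return counts


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    proposals = [(0, 0, 10, 10), (2, 0, 12, 10), (3, 0, 13, 10), (4, 0, 14, 10)]
    print(static_positive_counts(proposals, gt))   # fewer positives as t rises
    print(cascade_positive_counts(proposals, gt))  # refinement keeps positives
```

With fixed proposals, raising the threshold leaves fewer positive samples. Refining the boxes between stages keeps the positive set large even at the strictest threshold, which is the motivation for resampling through a cascade.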