Abstract

Scale diversity, small targets, and limited power make object detection in remote sensing imagery on satellites a challenging problem. To address scale diversity and small targets, this paper proposes a novel feature pyramid network with Adaptive Residual Spatial Bi-Fusion (ARSF). ARSF networks provide a robust fusion of multi-scale semantic information and fine spatial details. A spatial feature fusion module in ARSF networks adapts to object size variation by learning the most crucial feature maps. Compared with the original feature pyramid network, our method forms a shorter critical path for information transmission. Experiments show that a validation instance, YOLOv3-ARSF, achieves state-of-the-art performance of 85.8 mAP on the NWPU-VHR10 dataset. YOLOv3-ARSF is only 3 MB larger than YOLOv3 yet outperforms it by 2.3% mAP, which shows that ARSF is efficient. To address the power limitation, two lightweight versions, ARSF(lite) and ARSF(lite+), are also validated for future research on online object detection on satellites in aerospace engineering. Visualizations and details are provided for a more comprehensive understanding.

Highlights

  • With the rapid development of remote sensing technology, massive remote sensing image data have been generated by satellites

  • We use the NWPU-VHR10 dataset [34] to show that our proposed Adaptive Residual Spatial Bi-Fusion (ARSF) network achieves performance on par with SoTA detectors

  • If the IoU between the predicted bounding box and the ground truth is larger than 0.5, the prediction is counted as a true positive (TP); otherwise, it is counted as a false positive (FP)
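
The IoU rule in the last highlight can be illustrated with a minimal Python sketch. The box format `(x1, y1, x2, y2)` and the names `iou` and `classify_prediction` are assumptions made for this example and do not come from the paper. A full evaluation would also match predictions to ground truths per class and handle duplicate detections, which this sketch leaves out.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (empty if the boxes are disjoint).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classify_prediction(pred_box, gt_box, threshold=0.5):
    """Label one prediction as TP if its IoU with the ground truth exceeds the threshold."""
    return "TP" if iou(pred_box, gt_box) > threshold else "FP"
```

For example, a prediction shifted by 2 pixels against a 10 x 10 ground-truth box overlaps 80 of 120 union pixels (IoU of about 0.67) and is labelled TP, while a shift of 5 pixels gives 50 of 150 (IoU of about 0.33) and is labelled FP.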


Summary

INTRODUCTION

With the rapid development of remote sensing technology, massive remote sensing image data have been generated by satellites. Many papers apply deep learning methods to object detection in remote sensing images [8]–[11], but the networks in these papers are large and computationally heavy, so efficient detection on satellites with limited onboard memory and computing power remains challenging.

FEATURE PYRAMID NETWORK

Because pooling layers are applied repeatedly in a CNN to extract high-level semantics, information about small objects can be filtered out during downsampling. To cope with this problem, FPN [3] uses a top-down path module to fuse features from different levels, which noticeably increases detector performance.
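
The idea of a top-down path can be sketched in a few lines of plain Python. This is a toy illustration of generic top-down fusion, not the ARSF module or the exact FPN implementation. It assumes single-channel feature maps stored as nested lists, each level half the resolution of the previous one, and it omits the lateral convolutions a real FPN applies. The names `upsample2x` and `top_down_fuse` are hypothetical.

```python
def upsample2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    return [[v for v in row for _ in (0, 1)] for row in grid for _ in (0, 1)]


def top_down_fuse(levels):
    """Fuse feature maps ordered fine -> coarse.

    Starting from the coarsest map, each finer level receives the
    upsampled fused map from the level above by element-wise addition.
    """
    fused = [levels[-1]]  # the coarsest level is passed through unchanged
    for level in reversed(levels[:-1]):
        up = upsample2x(fused[-1])
        fused.append([[a + b for a, b in zip(row, up_row)]
                      for row, up_row in zip(level, up)])
    return fused[::-1]  # restore fine -> coarse order
```

With three all-ones maps of size 4 x 4, 2 x 2, and 1 x 1, the fused maps hold the values 3, 2, and 1 respectively: the finest level accumulates semantic information from every coarser level while keeping its own spatial resolution.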

EVALUATION INDICATORS
COST-FREE TRICKS
Findings
CONCLUSION
