Abstract

Recently, deep convolutional neural networks have brought great improvements in object detection. However, balancing high accuracy with high speed remains a challenging task in multiclass object detection for large-scale remote sensing imagery. One-stage methods are widely used because of their high efficiency but are limited by their performance on small objects. In this article, we propose a unified framework called the feature-merged single-shot detection (FMSSD) network, which aggregates context information both across multiple scales and within same-scale feature maps. First, our network leverages the atrous spatial feature pyramid (ASFP) module to fuse context information in multiscale features by combining a feature pyramid with multiple atrous rates. Second, we propose a novel area-weighted loss function that pays more attention to small objects, whereas the original loss it replaces treats all objects equally. We believe that small objects should be given more weight than large objects because they lose more information during training. Specifically, a monotonically decreasing function of object area is used to weight the loss. Extensive experiments on the DOTA and NWPU VHR-10 data sets demonstrate that our method achieves state-of-the-art detection accuracy with high efficiency. We also build a new large-scale data set, called the AIR-OBJ data set, from Google Earth imagery and show detection results on small objects, validating the method's effectiveness on large-scale remote sensing imagery.
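The area-weighted loss idea can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the specific decreasing function, the clipping bounds, and the normalization below are assumptions chosen for the sketch; the paper only requires that the weight decrease monotonically with object area.

```python
import numpy as np

def area_weights(areas, image_area):
    """Monotonically decreasing weights in object area.

    Hypothetical form: w_i = sqrt(image_area / area_i), clipped for
    numerical stability. Smaller boxes therefore receive larger weights.
    """
    w = np.sqrt(image_area / np.asarray(areas, dtype=float))
    return np.clip(w, 1.0, 10.0)

def area_weighted_loss(per_box_losses, areas, image_area=800 * 800):
    """Weighted average of per-box losses, emphasizing small objects."""
    w = area_weights(areas, image_area)
    return float(np.sum(w * np.asarray(per_box_losses)) / np.sum(w))
```

With equal per-box losses the weighted average equals the unweighted one, so the weighting only redistributes emphasis toward small objects rather than changing the overall loss scale.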
