Abstract

Object detection has attracted increasing attention in the field of remote sensing image analysis. Complex backgrounds, vertical views, and variations in target type and size make object detection in remote sensing images a challenging task. In this work, considering that the types of objects are often closely related to the scene in which they are located, we propose a convolutional neural network (CNN) that incorporates scene-contextual information for object detection. Specifically, we put forward the scene-contextual feature pyramid network (SCFPN), which aims to strengthen the relationship between the target and the scene and to address problems caused by variations in target size. Additionally, to improve feature extraction, the network is constructed by repeating an aggregated residual building block. This block enlarges the receptive field, allowing richer target information to be extracted and yielding excellent performance on small object detection. Moreover, to further improve performance, we use group normalization, which divides the channels into groups and computes the mean and variance for normalization within each group, thereby overcoming the limitations of batch normalization. The proposed method is validated on a public and challenging dataset. The experimental results demonstrate that our proposed method outperforms other state-of-the-art object detection models.
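
The abstract describes group normalization as computing the mean and variance within channel groups rather than across the batch. The following is a minimal NumPy sketch of that normalization step, not the paper's implementation; the group count, epsilon, and function name are illustrative assumptions.

```python
import numpy as np

def group_norm(x, num_groups=32, eps=1e-5, gamma=None, beta=None):
    """Normalize a feature map of shape (N, C, H, W) within channel groups."""
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    # Split channels into groups and compute per-group statistics.
    # Unlike batch normalization, the statistics do not depend on the
    # batch dimension, so small batches are not a problem.
    x = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = x.mean(axis=(2, 3, 4), keepdims=True)
    var = x.var(axis=(2, 3, 4), keepdims=True)
    x = (x - mean) / np.sqrt(var + eps)
    x = x.reshape(n, c, h, w)
    # Optional learnable per-channel scale and shift.
    if gamma is not None:
        x = x * gamma.reshape(1, c, 1, 1)
    if beta is not None:
        x = x + beta.reshape(1, c, 1, 1)
    return x
```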

Highlights

  • Object detection in remote sensing images is of great importance for many practical applications, such as urban planning and urban ecological environment evaluation

  • In this paper, we propose a multi-scale convolutional neural network (CNN)-based detection method called a scene-contextual feature pyramid network (SCFPN), which is based on an FPN, by combining scene-contextual features with a backbone network

  • The main contributions of this paper are as follows: we propose the scene-contextual feature pyramid network, which is based on a multi-scale detection framework, to enhance the relationship between scene and target and to ensure effective detection of multi-scale objects

Summary

Introduction

Object detection in remote sensing images is of great importance for many practical applications, such as urban planning and urban ecological environment evaluation. With objectness scores, the Faster R-CNN model can filter out many low-scoring ROIs and shorten detection time. Although such methods have achieved good results in object detection, they still adopt a single-scale feature layer, in which the detection of targets at various scales, especially small objects, is not effective. In this paper, we propose a multi-scale CNN-based detection method, the scene-contextual feature pyramid network (SCFPN), which is based on an FPN and combines scene-contextual features with a backbone network; it enhances the relationship between scene and target and ensures effective detection of multi-scale objects.
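
As a rough illustration of the top-down, multi-scale fusion that an FPN-style detector performs, the NumPy sketch below merges four backbone feature maps of halving resolution. It assumes all levels share the same channel count and uses nearest-neighbour upsampling; the learned lateral and output convolutions, and the paper's scene-contextual fusion, are not reproduced here because their details are not given in this excerpt.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_pyramid(backbone_features):
    """Merge backbone features [C1 (finest) ... C4 (coarsest)] top-down.

    Each pyramid level is the coarser level upsampled and added to the
    lateral feature of the same resolution (channel counts assumed equal).
    """
    c1, c2, c3, c4 = backbone_features
    p4 = c4
    p3 = c3 + upsample2x(p4)
    p2 = c2 + upsample2x(p3)
    p1 = c1 + upsample2x(p2)
    # Detection heads would run on every level, so objects of different
    # sizes are matched to features of an appropriate resolution.
    return [p1, p2, p3, p4]

# Example: four levels with 256 channels and halving spatial size.
feats = [np.random.rand(256, 64, 64), np.random.rand(256, 32, 32),
         np.random.rand(256, 16, 16), np.random.rand(256, 8, 8)]
pyramid = top_down_pyramid(feats)
```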

Proposed Method
[Figure: top-down feature pyramid built over backbone feature levels C1–C4]
Backbone Network
Group Normalization
Experiments
Training Details
Method Testing Time
Discussion