Abstract

As the acquisition of very high resolution (VHR) images becomes easier, the complex characteristics of VHR images pose new challenges to traditional machine learning semantic segmentation methods. As an excellent convolutional neural network (CNN) structure, U-Net does not require manual intervention, and its high-precision features are widely used in image interpretation. However, as an end-to-end fully convolutional network, U-Net has not explored enough information from the full scale, and there is still room for improvement. In this study, we constructed an effective network module: residual module under a multisensory field (RMMF) to extract multiscale features of target and an attention mechanism to optimize feature information. RMMF uses parallel convolutional layers to learn features of different scales in the network and adds shortcut connections between stacked layers to construct residual blocks, combining low-level detailed information with high-level semantic information. RMMF is universal and extensible. The convolutional layer in the U-Net network is replaced with RMMF to improve the network structure. Additionally, the multiscale convolutional network was tested using RMMF on the Gaofen-2 data set and Potsdam data sets. Experiments show that compared to other technologies, this method has better performance in airborne and spaceborne images.

Highlights

  • Remote sensing (RS) is a comprehensive earth observation technology developed in the 1960s, which can realize repeated detection of the same area in a short period of time

  • There is obvious “spiced salt” phenomenon in SegNet and FCN8s, which indicates that the upsampling operation adopted by SegNet and FCN8s cannot improve the accuracy of semantic segmentation model very well

  • This paper proposes a novel model, which has the following advantages compared with other models

Read more

Summary

Introduction

Remote sensing (RS) is a comprehensive earth observation technology developed in the 1960s, which can realize repeated detection of the same area in a short period of time. CNN has achieved surprising results in processing remote sensing images, such as scene classification and semantic segmentation [30,31,32] As they can automatically generate powerful and representative features layer by layer in the neural network without human intervention [33], they can mine the spatial dependence between various segmented objects in the images, providing multiple methods for VHR image semantic segmentation [34,35,36]. The major contributions of this work are: (1) proposing a Multi-U-Net model combining U-Net, RMMF, and an attention mechanism to extract ground objects from VHR images; (2) devising RMMF to learn multi-scale features so that global and local features can be excavated; and (3) using data sets obtained by different sensors to compare the performance of the state-of-the-art deep models.

Proposed Method
Residual Module under Multisensory Field
Network Structure
SSttuuddyy AArreeaa aanndd DDatta Source
Data Processing and Method Implementation
Accuracy Assessment
IoU F1 IoU F1 IoU F1 IoU F1 IoU F1 IoU F1 IoU OA mIoU
Comparison of the Results of the ISPRS Potsdam Data Set
Model Efficiency Analysis
Findings
Conclusions
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call