Abstract

Remote sensing (RS) images are widely used in various industries, and semantic segmentation of RS images is a common research direction. At the same time, the complexity of target information and the high similarity of features between classes make this task very challenging. In recent years, semantic segmentation algorithms for RS images have emerged in an endless stream, but most of them focus on improving the scale features of the target, and their accuracy still leaves considerable room for improvement. In this context, we propose a semantic segmentation framework for RS images with dynamic perceptual loss. The framework is built on the InceptionV-4 network to form a network that includes contextual semantic fusion and dual-channel atrous spatial pyramid pooling (ASPP); the segmentation network follows an encoder-decoder structure. In addition, by further observing the loss changes of the network, we design a dynamic perceptual loss module and a dynamic loss fusion strategy to better refine the classification details. Finally, we conduct experiments on the ISPRS 2D Semantic Labeling Contest Vaihingen Dataset and the Massachusetts Building Dataset. Compared with several segmentation networks, our model shows excellent performance.
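
The abstract does not spell out the configuration of the dual-channel ASPP. As a rough, hedged illustration only, a single ASPP block in PyTorch might look like the sketch below; the dilation rates, channel widths, and the way two such blocks would be wired together are assumptions, not values taken from the paper.

```python
# Minimal PyTorch sketch of one Atrous Spatial Pyramid Pooling (ASPP) block.
# Dilation rates (1, 6, 12, 18) and channel widths are illustrative assumptions.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        # Parallel atrous convolutions capture context at multiple scales.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch,
                          kernel_size=3 if r > 1 else 1,
                          padding=r if r > 1 else 0,
                          dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Fuse the parallel branches back into a single feature map.
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        return self.project(torch.cat(feats, dim=1))

# A "dual-channel" arrangement could apply two such blocks to two encoder
# stages and merge the results; the paper's exact wiring may differ.
```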

Highlights

  • In recent years, the development of aerospace technology and sensors has provided sufficient conditions for the utilization of remote sensing (RS) images

  • This framework is roughly divided into four parts: the InceptionV-4 network as the backbone, a dual Atrous Spatial Pyramid Pooling (ASPP) module, a decoder module, and a perceptual loss network

  • In order to comprehensively evaluate the performance of the model, this work uses three benchmark indicators, namely Intersection over Union (IoU), Overall Accuracy (OA), and F1 score (F1); a short computation sketch follows this list
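
These three indicators follow their standard definitions and can be computed from a per-class confusion matrix. The helper below is a minimal illustrative sketch, not the authors' evaluation code.

```python
import numpy as np

def segmentation_metrics(conf):
    """conf: square confusion matrix, rows = ground truth, cols = prediction."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp            # predicted as class c, but another class
    fn = conf.sum(axis=1) - tp            # class c predicted as something else
    iou = tp / (tp + fp + fn + 1e-12)     # per-class Intersection over Union
    precision = tp / (tp + fp + 1e-12)
    recall = tp / (tp + fn + 1e-12)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)  # per-class F1
    oa = tp.sum() / conf.sum()            # Overall Accuracy
    return iou, f1, oa

# Toy two-class example
conf = np.array([[50, 10],
                 [ 5, 35]])
iou, f1, oa = segmentation_metrics(conf)
print(iou, f1, oa)
```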


Summary

INTRODUCTION

The development of aerospace technology and sensors has provided sufficient conditions for the utilization of RS images.

PROPOSED MODEL

In this work, a deep learning framework for semantic segmentation of RS images is built, which can effectively address several problems encountered in the semantic segmentation of RS images. This framework is roughly divided into four parts: the InceptionV-4 network as the backbone, a dual ASPP module, a decoder module, and a perceptual loss network. Since the pre-trained convolutional neural network has already encoded the perceptual and semantic information computed in the loss function and has well-fitted parameters, the high-level features it extracts are more similar, which improves the training details and the performance of the network. We use the pre-trained VGG19 network as the loss network to build a complete semantic segmentation framework for RS images with perceptual loss, and we add a weight to the feature loss to dilute its impact in the loss fusion.
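
As a hedged illustration of the perceptual-loss idea described above, the sketch below freezes a pre-trained VGG19, compares high-level feature maps of the prediction and the target, and fuses the feature loss with a pixel-wise loss through a weight. The chosen VGG19 layer, the weight value `alpha`, and the pairing with cross-entropy are assumptions; the paper's dynamic weighting strategy is not reproduced here.

```python
# Sketch of a perceptual loss using a frozen, pre-trained VGG19 (torchvision).
# Inputs to the extractor are assumed to be 3-channel images normalized like ImageNet.
import torch
import torch.nn as nn
import torchvision.models as models

class PerceptualLoss(nn.Module):
    def __init__(self, layer_idx=18):  # slice up to relu3_4; illustrative choice
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features
        self.extractor = nn.Sequential(*list(vgg.children())[:layer_idx]).eval()
        for p in self.extractor.parameters():   # the loss network stays frozen
            p.requires_grad = False
        self.mse = nn.MSELoss()

    def forward(self, pred, target):
        # Compare high-level feature maps instead of raw pixels.
        return self.mse(self.extractor(pred), self.extractor(target))

# Fusing the losses: alpha dilutes the feature (perceptual) term so it refines
# details without dominating the pixel-wise term. Value is illustrative.
ce_loss = nn.CrossEntropyLoss()
perc_loss = PerceptualLoss()
alpha = 0.1
# total = ce_loss(logits, labels) + alpha * perc_loss(rgb_pred, rgb_target)
```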

EVALUATION CRITERIA
FINDINGS
CONCLUSION
