Abstract

Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in semantic segmentation of natural scene images. However, because natural scene images differ markedly from remotely sensed (RS) images, FCN-based semantic segmentation methods from computer vision do not perform well on RS images without modification. In previous work, we proposed an RS image semantic segmentation framework, SDFCNv1, combined with a majority-voting postprocessing method. Nevertheless, it still has drawbacks, such as a small receptive field and a large number of parameters. In this paper, we propose SDFCNv2, an improved semantic segmentation framework based on SDFCNv1, for semantic segmentation of RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which has a larger receptive field and fewer parameters. We also put forward a data augmentation method based on the spectral-specific stochastic gamma transform (SSSGT), applied during model training to improve the generalizability of our model. In addition, we design a mask-weighted voting decision fusion postprocessing algorithm for segmenting very large RS images. We conducted several comparative experiments on two public datasets and a real surveying and mapping dataset. Extensive experimental results demonstrate that, compared with the SDFCNv1 framework, our SDFCNv2 framework increases the mIoU metric by up to 5.22% while using only about half the parameters.
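To make the idea of a per-band stochastic gamma transform concrete, the following is a minimal sketch, not the authors' SSSGT implementation. It assumes the image is a float array scaled to [0, 1] and that one gamma value is drawn uniformly and independently for each spectral band. The function name `stochastic_gamma_per_band`, the uniform distribution, and the `gamma_range` default are all hypothetical choices made for illustration.

```python
import numpy as np


def stochastic_gamma_per_band(image, gamma_range=(0.5, 2.0), rng=None):
    """Apply an independently sampled gamma curve to each spectral band.

    image: float array of shape (H, W, C) with values in [0, 1].
    gamma_range: hypothetical sampling interval, not taken from the paper.
    Returns the augmented image and the gamma value used for each band.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.clip(np.asarray(image, dtype=np.float64), 0.0, 1.0)
    # One gamma per band, drawn uniformly (an assumed distribution).
    gammas = rng.uniform(gamma_range[0], gamma_range[1], size=image.shape[-1])
    # Broadcasting raises every pixel of band c to the power gammas[c].
    return image ** gammas, gammas


# Usage: augment a random 4x4 patch with 3 spectral bands.
rng = np.random.default_rng(0)
patch = rng.random((4, 4, 3))
augmented, gammas = stochastic_gamma_per_band(patch, rng=rng)
```

Because each band receives its own exponent, the relative brightness between bands changes from sample to sample. That is one plausible way for an augmentation of this kind to expose a model to spectral variation across sensors and acquisition conditions.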

Highlights

  • Due to the rapid and continuous development of space technology, remote sensing image analysis (RSIA) has become a popular research field for earth observation [1,2]. As a fundamental task in RSIA, semantic segmentation of remote sensing images, especially very high-resolution (VHR) remote sensing images, offers a variety of opportunities and applications for land use and land cover (LULC) investigation [3], environment monitoring [4], precision agriculture [5], urban planning [6], meteorology [7], etc. Semantic segmentation aims to interpret images by segmenting them into semantic objects and assigning each pixel to one of the predetermined categories

  • In order to overcome the mentioned drawbacks of existing methods, we propose an improved RS image semantic segmentation framework SDFCNv2 in this paper

  • The Potsdam dataset is provided by the International Society for Photogrammetry and Remote Sensing (ISPRS), and consists of digital orthophoto maps (DOMs) generated from aerial images of Germany


Summary

Introduction

Due to the rapid and continuous development of space technology, remote sensing image analysis (RSIA) has become a popular research field for earth observation [1,2]. As a fundamental task in RSIA, semantic segmentation of remote sensing images, especially very high-resolution (VHR) remote sensing images, offers a variety of opportunities and applications for land use and land cover (LULC) investigation [3], environment monitoring [4], precision agriculture [5], urban planning [6], meteorology [7], etc. Semantic segmentation aims to interpret images by segmenting them into semantic objects and assigning each pixel to one of the predetermined categories. With the continuous development and improvement of deep learning, fully convolutional neural networks (FCNs) achieve more accurate and stable performance in semantic segmentation tasks than traditional methods [8]. FCNs were proposed by Long et al. for pixel-wise semantic segmentation tasks. Decoders replace fully connected layers in CNNs with deconvolutional or upsampling layers, convert the feature map into a classification map with the same size as the input, and perform pixel-level predictions on images. Typical FCN models based on the encoder–decoder architecture (EDA) have been developed and proven effective in segmenting multi-class objects. Existing EDA-based FCN models for semantic segmentation tasks on natural scene images in the computer vision (CV) community include.
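The decoder step described above, turning a coarse feature map into a classification map with the same size as the input, can be illustrated with a small sketch. It is not any particular FCN: it assumes the network has already produced per-class scores on a coarse grid, swaps the learned deconvolution for plain nearest-neighbor upsampling, and takes the per-pixel argmax. The function name `scores_to_class_map` is hypothetical.

```python
import numpy as np


def scores_to_class_map(scores, out_hw):
    """Upsample coarse class scores and pick the best class for each pixel.

    scores: array of shape (C, h, w) holding one score map per class.
    out_hw: (H, W), the spatial size of the original input image.
    Returns an integer array of shape (H, W) with one class index per pixel.
    """
    _, h, w = scores.shape
    out_h, out_w = out_hw
    # Nearest-neighbor lookup: each output pixel maps to a coarse source cell.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    upsampled = scores[:, rows[:, None], cols[None, :]]  # shape (C, H, W)
    # Pixel-level prediction: the class with the highest score wins.
    return upsampled.argmax(axis=0)


# Usage: two classes on a 2x2 grid, upsampled to a 4x4 classification map.
scores = np.zeros((2, 2, 2))
scores[1, 0, 0] = 1.0  # class 1 wins only in the top-left coarse cell
class_map = scores_to_class_map(scores, (4, 4))
```

In a trained FCN the upsampling is typically learned (deconvolution) or bilinear rather than nearest-neighbor, but the output contract is the same: one class label for every input pixel.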

