Abstract

Semantic segmentation of remote sensing images (RSI) plays a significant role in urban management and land cover classification. Because RSI contain rich spatial information, existing convolutional neural network (CNN)-based methods often segment them inaccurately and lose some edge information of objects. In addition, recent studies have shown that combining 3D geometric data with 2D appearance helps to distinguish pixel categories. However, most of these methods require height maps as additional inputs, which severely limits their application. To alleviate these issues, we propose a height aware-multi path parallel network (HA-MPPNet). The proposed MPPNet first obtains multi-level semantic features while maintaining the spatial resolution in each path to preserve detailed image information. Afterward, gated high-low level feature fusion is used to compensate for the lack of low-level semantics. Then, we design a height feature decode branch to learn height features under the supervision of digital surface model (DSM) images, and we use the learned embeddings to improve semantic context through height feature guide propagation. Note that our module is end-to-end and does not need a DSM image as additional input after training. Our method outperformed other state-of-the-art semantic segmentation methods on publicly available remote sensing image datasets.

Highlights

  • Owing to the rapid development of satellite observation technology, a large number of high spatial resolution (HSR) remote sensing images can be acquired

  • We aimed to preserve the spatial information of remote sensing images and utilize digital surface model (DSM) images to strengthen the semantic context during the process of segmentation

  • Our method outperformed all of the compared methods, with the highest overall accuracy (OA) of 91.54% and the highest mean intersection over union
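The two metrics named in the last highlight, overall accuracy (OA) and mean intersection over union, have standard definitions based on a confusion matrix. The following is a minimal sketch of those textbook definitions in plain Python, not the authors' evaluation code. The function names and the toy labels are hypothetical, and it is an assumption that classes absent from both reference and prediction are skipped when averaging; benchmark protocols may also exclude specific classes.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count label pairs; rows are reference classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """OA: correctly labelled pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # reference c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, reference other
        denom = tp + fp + fn
        if denom:  # assumption: skip classes that never occur
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps (hypothetical data, three classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
```

On real data, the label maps would be the flattened reference and predicted class indices of every evaluated pixel, accumulated into one confusion matrix over the whole test set before the metrics are computed.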

Summary

Introduction

Owing to the rapid development of satellite observation technology, a large number of high spatial resolution (HSR) remote sensing images can be acquired. Extracting objects such as buildings, cars, and trees from remote sensing images is significant for land cover classification [1], urban management [2], and city planning [3]. ... information, which becomes the main obstacle to accurately extracting spatial information from remote sensing images. To solve this problem, DeepLabv1 [11] uses conditional random fields (CRF) for post-processing to optimize the segmentation of edges, and Hierarchical [12] enlarges the scale of the input image to obtain a high-resolution result, but both increase the amount of network computation. HRNet [13] has proposed a high-resolution CNN that obtains semantic features while maintaining a high-resolution representation, but its high-level semantics are not rich.

