Abstract

ABSTRACT Semantic segmentation for high-resolution remote sensing imagery is a pivotal component of land use and land cover categorization, and height estimation is essential for rebuilding the 3D information of an image. Because of the higher intra-class variation and smaller inter-class dissimilarity, these two challenging tasks are generally treated separately. This paper proposes a fully convolutional network that can tackle these problems simultaneously by estimating the land-cover categories and height values of pixels from a single aerial image. To handle these tasks, we develop a multi-task learning architecture (JSH-Net) that employs a shared feature representation and exploits their potential consistency across tasks, resulting in robust features and better prediction accuracy. Specifically, we propose a novel skip connection module that aggregates the contexts from the encoder part to the decoder part, bridging the semantic gap between them. In addition, we propose a progressive refinement strategy to recover detailed information about the objects. Moreover, we also proposed a height estimation branch on the head of the model to utilize shared features. The experiments we conducted on ISPRS 2D Labelling dataset verified that our network provided precise results of semantic segmentation and height estimation from two output branches and outperformed other state-of-the-art approaches.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.