Abstract

Segmentation of high-resolution remote sensing images is an important challenge with wide practical applications. Increasing spatial resolution provides fine details for image segmentation but also introduces segmentation ambiguities. In this paper, we propose a generative adversarial network with spatial and channel attention mechanisms (GAN-SCA) for the robust segmentation of buildings in remote sensing images. The segmentation network (generator) of the proposed framework combines the well-known semantic segmentation architecture U-Net with spatial and channel attention mechanisms (SCA). The adoption of SCA enables the segmentation network to selectively enhance the more useful features at specific positions and in specific channels, yielding results closer to the ground truth. The discriminator is an adversarial network with channel attention mechanisms that discriminates between the outputs of the generator and the ground truth maps. The segmentation network and the adversarial network are trained in an alternating fashion on the Inria aerial image labeling dataset and the Massachusetts buildings dataset. Experimental results show that, on the Inria aerial image labeling dataset, the proposed GAN-SCA achieves an overall accuracy of 96.61% and an intersection over union (IoU) of 77.75%, and on the Massachusetts buildings dataset it achieves an F1-measure of 96.36%, outperforming several state-of-the-art approaches.
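
To make two of the reported metrics concrete, the following is a minimal sketch of pixel-wise overall accuracy and F1-measure for binary building masks. It assumes masks in which nonzero pixels mean "building"; the function names and the toy arrays are illustrative and not taken from the paper, and the exact evaluation protocol the authors used for each dataset is not stated in this abstract.

```python
import numpy as np


def overall_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return float(np.mean(pred == truth))


def f1_measure(pred, truth):
    """Plain pixel-wise F1 for the building class: 2TP / (2TP + FP + FN)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # building predicted, building in truth
    fp = np.sum(pred & ~truth)   # building predicted, background in truth
    fn = np.sum(~pred & truth)   # background predicted, building in truth
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return float(2 * tp / denom) if denom else 1.0


# Toy 2x4 masks (hypothetical data, 1 = building).
pred_mask = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0]])
truth_mask = np.array([[1, 1, 1, 0],
                       [0, 0, 0, 0]])

print(overall_accuracy(pred_mask, truth_mask))  # 6 of 8 pixels agree
print(f1_measure(pred_mask, truth_mask))        # TP=2, FP=1, FN=1
```

On the toy masks, six of the eight pixels agree, and the building class has two true positives, one false positive and one false negative.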

Highlights

  • With the rapid advancement of aerospace remote sensing, the amount and spatial resolution of high-resolution remote sensing images are increasing rapidly

  • Intersection over union (IoU) is the number of pixels labeled as building in both the predicted results and the ground truths, divided by the number of pixels labeled as building in either the predicted results or the ground truths (see Equation (9))

  • The Mnih-CNN+CRF method, which combines convolutional neural networks (CNNs) with conditional random fields (CRFs), was proposed by the producers of the Massachusetts buildings dataset; it belongs to the patch-based category and uses CRFs as a post-processor
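
The IoU definition in the highlights above can be sketched in a few lines. This is an illustrative example assuming binary masks in which nonzero pixels mean "building"; the function name `building_iou` and the toy arrays are hypothetical and not taken from the paper.

```python
import numpy as np


def building_iou(pred, truth):
    """IoU of the building class: |pred AND truth| / |pred OR truth|."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    # Assumed convention: if neither mask contains a building pixel, IoU is 1.
    return float(intersection / union) if union else 1.0


# Toy 2x4 masks (hypothetical data, 1 = building).
pred_mask = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0]])
truth_mask = np.array([[1, 1, 1, 0],
                       [0, 0, 0, 0]])

print(building_iou(pred_mask, truth_mask))  # intersection = 2, union = 4
```

Here two pixels are labeled building in both masks and four pixels are labeled building in at least one of them, so the IoU is 2/4.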


Summary

Introduction

With the rapid advancement of aerospace remote sensing, the amount and spatial resolution of high-resolution remote sensing images are increasing rapidly. Accurate and automatic semantic labeling of high-resolution remote sensing images is of great significance and has received wide attention [1]. The large intra-class variance and small inter-class differences of higher spatial resolution remote sensing images may cause classification ambiguities, which makes semantic segmentation of such images a challenge. Some regions have small and very dense buildings, whilst other regions have low-density buildings. This variability poses great challenges for the building segmentation task and requires classification techniques with strong generalization capabilities [2,3].
