Abstract

Because it does not require manually labeled pixel-level samples, weakly supervised semantic segmentation (WSSS) is making progress in automatically extracting buildings from high-resolution (HR) remote sensing (RS) imagery. For WSSS methods, generating high-quality pseudomasks is crucial for accurate building extraction. To improve the quality of pseudomasks generated from image-level labels, this article proposes a weakly supervised building extraction method that combines adversarial climbing and gated convolution. The proposed method optimizes class activation maps (CAMs) with an adversarial climbing strategy, generates accurate class boundary maps by introducing a gated convolution module, and further refines building pseudomasks by fusing pairwise semantic affinities and CAMs with a random walk strategy. Experimental results on three datasets (two ISPRS datasets and a self-annotated dataset) demonstrate that the proposed approach outperforms state-of-the-art (SOTA) WSSS methods, improving building extraction from HR RS imagery. This article provides a new approach for optimizing pseudomask generation and a methodological reference for applying weakly supervised learning to RS images.
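The random walk refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic affinity-based random walk in which a pairwise affinity matrix is sharpened, row-normalized into a transition matrix, and applied to the flattened CAM a few times. The function name `random_walk_refine` and the parameters `beta` and `n_iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def random_walk_refine(cam, affinity, beta=8, n_iters=4):
    """Propagate CAM scores along pairwise affinities (illustrative sketch).

    cam:      (H, W) array of activation scores for one class.
    affinity: (H*W, H*W) nonnegative array; entry (i, j) is the assumed
              semantic affinity between pixel i and pixel j.
    beta:     hypothetical sharpening exponent (element-wise power).
    n_iters:  hypothetical number of random walk steps.
    """
    h, w = cam.shape
    # Element-wise power suppresses weak affinities relative to strong ones.
    sharpened = np.power(affinity, beta)
    # Row-normalize so each row of the transition matrix sums to 1.
    row_sums = sharpened.sum(axis=1, keepdims=True)
    transition = sharpened / np.maximum(row_sums, 1e-12)
    scores = cam.reshape(-1).astype(float)
    for _ in range(n_iters):
        # One random walk step: each pixel averages the scores of its neighbors.
        scores = transition @ scores
    return scores.reshape(h, w)


# Toy example: four pixels forming two groups with no affinity between groups.
toy_cam = np.array([[1.0, 0.0, 0.0, 0.0]])
toy_affinity = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])
refined = random_walk_refine(toy_cam, toy_affinity)
```

In the toy example the activation of the first pixel spreads to the second pixel, which shares a high affinity with it, and does not leak into the other group. This is the behavior that lets a sparse CAM grow toward a more complete building region while stopping at class boundaries.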

Highlights

  • Automatic building extraction plays an important role in urban planning [1], building change detection [2], [3], and geographic data mapping and updating [4]

  • Under shadow interference and for both regular and irregular building shapes, the proposed method preserves more complete building regions and clearer building boundaries. These results suggest that ACGC is promising

  • ACGC achieves 93.4% and 96.4% of the fully supervised method's performance in terms of intersection over union (IoU) and F1-score on the Potsdam dataset, and 89.6% and 94.2% on the Vaihingen dataset
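The IoU and F1-score named in the highlights can be computed from a predicted and a reference binary building mask. The sketch below uses the standard definitions of both metrics. The function name `iou_and_f1` and the toy masks are hypothetical, and the convention of returning 1.0 when both masks are empty is an assumption rather than something stated in the article.

```python
import numpy as np


def iou_and_f1(pred, truth):
    """Return (IoU, F1) for the building class of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.logical_and(pred, truth).sum())    # building predicted, building in reference
    fp = int(np.logical_and(pred, ~truth).sum())   # building predicted, background in reference
    fn = int(np.logical_and(~pred, truth).sum())   # background predicted, building in reference
    union = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return iou, f1


# Toy 2x2 masks: one true positive, one false positive, one false negative.
pred_mask = [[1, 1], [0, 0]]
truth_mask = [[1, 0], [1, 0]]
iou, f1 = iou_and_f1(pred_mask, truth_mask)
```

The percentages in the highlight are relative figures: the weakly supervised score divided by the score of the fully supervised method on the same dataset.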


Summary

Introduction

Automatic building extraction plays an important role in urban planning [1], building change detection [2], [3], and geographic data mapping and updating [4]. With the increasing amount of high-resolution (HR) remote sensing (RS) imagery becoming an important and reliable data source, efficiently and accurately extracting buildings from RS images is both important and challenging. Buildings can be extracted by classifying the pixels of RS images as building or nonbuilding, which is regarded as a semantic segmentation task in computer vision [5]. Represented by fully convolutional networks (FCNs) [6], a growing body of supervised approaches has been developed to advance building extraction, achieving significant performance improvements. These methods usually require a large number of labeled pixel-level samples to learn model parameters, especially given the great variation of buildings in HR imagery across regions. New methods that effectively extract buildings from HR RS images by utilizing available labeled datasets or obtained labels are urgently needed [8].

