Semantic Segmentation for Buildings of Large Intra-Class Variation in Remote Sensing Images with O-GAN

Shuting Sun, Xiaolei Liu, Lin Mu, Lizhe Wang, Yuwei Zhang, Peng Liu

doi:10.3390/rs13030475

Abstract

Building extraction from remote sensing images is important to many applications, such as urban planning and economic status assessment. Deep learning, with deep network structures and back-propagation optimization, can automatically learn features of targets in high-resolution remote sensing images. However, the generalizability of deep networks depends almost entirely on the quality and quantity of the labels. Building extraction performance is therefore greatly affected when there is large intra-class variation among samples of a single target class. To solve this problem, a subdivision method for reducing intra-class differences is proposed to enhance semantic segmentation. We propose that backgrounds and targets be generated separately by two orthogonal generative adversarial networks (O-GAN). The two O-GANs are connected by adding a new loss function to their discriminators. To better extract building features, drawing on the idea of fine-grained image classification, feature vectors for a target are obtained from an intermediate convolution layer of O-GAN with selective convolutional descriptor aggregation (SCDA). Subsequently, the feature vectors are clustered into new subdivisions, which are used to train semantic segmentation networks. In the prediction stage, the subdivisions are merged back into one class. Experiments were conducted with remote sensing images of the Tibet area, where there are both tall buildings and herdsmen’s tents. The results indicate that, compared with direct semantic segmentation, the proposed subdivision method improves accuracy by about 4%. In addition, statistics and visualizations of building features support the rationality of the features and subdivisions.
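The pipeline described in the abstract (aggregate convolutional descriptors, cluster them into subdivisions, merge the subdivisions at prediction time) can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: the function names are hypothetical, the clustering step is plain k-means (the abstract does not name the clustering algorithm), and the SCDA step keeps only the mean-threshold selection and the average/max pooling, omitting the largest-connected-component filtering of the original SCDA method. The feature maps would come from an intermediate O-GAN layer; here they are just arrays.

```python
import numpy as np


def scda_descriptor(feature_map):
    """Aggregate an (H, W, C) activation tensor into a 2C-dim vector, SCDA-style."""
    # Sum over channels to obtain an (H, W) aggregation map.
    aggregation = feature_map.sum(axis=-1)
    # Keep only positions whose aggregated response exceeds the map's mean.
    mask = aggregation > aggregation.mean()
    selected = feature_map[mask]  # shape (N, C)
    if selected.shape[0] == 0:
        # Constant map: nothing exceeds the mean, so fall back to all positions.
        selected = feature_map.reshape(-1, feature_map.shape[-1])
    # Concatenate average-pooled and max-pooled descriptors.
    return np.concatenate([selected.mean(axis=0), selected.max(axis=0)])


def kmeans_labels(vectors, k, n_iter=20, seed=0):
    """Assign each feature vector to one of k subdivisions (plain k-means, assumed)."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers with k distinct samples (fancy indexing returns a copy).
    centers = vectors[rng.choice(len(vectors), size=k, replace=False)]
    labels = np.zeros(len(vectors), dtype=int)
    for _ in range(n_iter):
        # Distance from every vector to every center, shape (n, k).
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members) > 0:  # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return labels


def merge_subclasses(prediction, building_subclasses):
    """Map every building subdivision id to 1 and all other ids to 0."""
    return np.isin(prediction, building_subclasses).astype(np.uint8)
```

In use, each building sample would get one descriptor from `scda_descriptor`, the descriptors would be passed to `kmeans_labels` to obtain subdivision labels for training the segmentation network, and `merge_subclasses` would collapse the predicted subdivision ids into a single building mask.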

Highlights

Buildings, a major type of man-made structure closely related to humans, are an important symbol of human activities
In this paper, inspired by the idea of orthogonal generative adversarial networks (O-GAN), we propose a new architecture that consists of two connected GANs
The O-GAN model is implemented in Keras (version 2.2.4) and run on an NVIDIA RTX 2080Ti GPU platform

Summary

Introduction

Buildings, a major type of man-made structure closely related to humans, are an important symbol of human activities. Building information can further be used to study human activities, land use change, population estimation and prediction, disaster assessment, etc. Building extraction from remote sensing images has become an important way of acquiring this information. Conventional building extraction methods mainly include traditional machine learning methods, such as the Normalized Difference Built-up Index (NDBI) [1], Morphological Shadow Index (MSI) [2], and Adaboost [3]. The advent of high-resolution remote sensing images provides a wealth of detailed information, which makes it possible to acquire clear images of small buildings such as residences and temporary houses. The complex texture and fragility of features are revealed at very high resolutions, so it is often difficult to model building

Objectives

Methods

Results

Discussion

Conclusion