Abstract

Urban land cover classification for high-resolution images is a fundamental yet challenging task in remote sensing image analysis. Recently, deep learning techniques have achieved outstanding performance in high-resolution image classification, especially methods based on deep convolutional neural networks (DCNNs). However, traditional CNNs, which use convolution operations with local receptive fields, are not sufficient to model global contextual relations between objects. In addition, multiscale objects and the relatively small sample sizes in remote sensing have also limited classification accuracy. In this paper, a relation-enhanced multiscale convolutional network (REMSNet) is proposed to overcome these weaknesses. A dense connectivity pattern and parallel multi-kernel convolutions are combined to build a lightweight model with varied receptive field sizes. Then, a spatial relation-enhanced block and a channel relation-enhanced block are introduced into the network; they adaptively learn global contextual relations between any two positions or feature maps to enhance feature representations. Moreover, we design a parallel multi-kernel deconvolution module and a spatial path to further aggregate information at different scales. The proposed network is evaluated for urban land cover classification on two datasets: the ISPRS Vaihingen 2D semantic labelling contest and an area of Shanghai of about 143 km². The results demonstrate that the proposed method can effectively capture long-range dependencies and improve the accuracy of land cover classification. Our model obtains an overall accuracy (OA) of 90.46% and a mean intersection-over-union (mIoU) of 0.8073 on Vaihingen, and an OA of 88.55% and an mIoU of 0.7394 on Shanghai.
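As a rough illustration of the kinds of components the abstract describes, the sketch below shows a self-attention-style spatial relation block (pairwise affinities between all spatial positions used to aggregate global context) and a parallel multi-kernel convolution (branches with different kernel sizes concatenated to vary the receptive field). The class names, parameters, and PyTorch formulation are assumptions made for illustration only, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' code: a spatial relation-enhanced block
# and a parallel multi-kernel convolution in the spirit described in the abstract.
import torch
import torch.nn as nn


class SpatialRelationBlock(nn.Module):
    """Self-attention-style block: relates every spatial position to every other."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C/r)
        k = self.key(x).flatten(2)                      # (B, C/r, HW)
        attn = torch.softmax(q @ k, dim=-1)             # (B, HW, HW) pairwise relations
        v = self.value(x).flatten(2)                    # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)  # context-aggregated features
        return self.gamma * out + x                     # residual enhancement


class ParallelMultiKernelConv(nn.Module):
    """Parallel convolutions with different kernel sizes, concatenated and fused."""

    def __init__(self, in_channels, out_channels, kernel_sizes=(1, 3, 5)):
        super().__init__()
        branch_channels = out_channels // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Conv2d(in_channels, branch_channels, k, padding=k // 2)
            for k in kernel_sizes
        )
        self.fuse = nn.Conv2d(branch_channels * len(kernel_sizes), out_channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(SpatialRelationBlock(64)(x).shape)          # torch.Size([1, 64, 32, 32])
    print(ParallelMultiKernelConv(64, 128)(x).shape)  # torch.Size([1, 128, 32, 32])
```

An analogous channel relation-enhanced block would compute affinities between feature maps (a C-by-C relation matrix) rather than between spatial positions; the residual weighting keeps the block easy to insert into an existing encoder-decoder.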

Highlights

  • Urban land use and land cover information is essential for understanding the constant changes on the surface of the Earth and associated socioecological interactions [1]

  • We present extensive experimental results comparing the proposed method with other typical deep learning models for semantic segmentation, including SegNet [31], MobileNet [59], DeepLabv3 [15], FC-DenseNet [50], and the Pyramid Scene Parsing Network (PSPNet) [30]

  • It is demonstrated that the relation-enhanced multiscale convolutional network (REMSNet) outperforms the other methods in terms of mean F1 score, mean intersection-over-union (mIoU), and overall accuracy

Introduction

Urban land use and land cover information is essential for understanding the constant changes on the surface of the Earth and associated socioecological interactions [1]. With advances in remote sensing data acquisition technologies, huge amounts of high-spatial-resolution remote sensing images are becoming increasingly widespread [5,6]. This opens new opportunities for urban land cover information extraction at a very detailed level [7]. It is therefore urgent to interpret high-spatial-resolution remote sensing images with intelligent, automatic methods for land cover classification [8,9].
