Abstract

Deep convolutional neural networks have driven significant progress in building extraction from high-resolution remote sensing imagery. Whereas most such work modifies existing image segmentation networks from computer vision, this paper proposes a new network, the Deep Encoding Network (DE-Net), designed specifically for this problem and built on several recently introduced image segmentation techniques. DE-Net is constructed from four kinds of modules: inception-style downsampling modules that combine a strided convolution layer and a max-pooling layer, encoding modules comprising six linear residual blocks with a scaled exponential linear unit (SELU) activation function, compressing modules that reduce the number of feature channels, and a densely upsampling module that enables the network to encode spatial information inside feature maps. DE-Net achieves state-of-the-art performance on the WHU Building Dataset in recall, F1-Score, and intersection over union (IoU) metrics without pre-training. It also outperforms several segmentation networks on our self-built Suzhou Satellite Building Dataset. The experimental results validate the effectiveness of DE-Net for building extraction from both aerial and satellite imagery. They also suggest that, given enough training data, designing and training a network from scratch may outperform fine-tuning models pre-trained on datasets unrelated to building extraction.
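The abstract reports recall, F1-Score, and IoU for the building class. As a minimal sketch of how these metrics are commonly computed from a predicted and a reference binary mask, the NumPy snippet below may help; the function name `building_metrics` and the toy masks are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def building_metrics(pred, target):
    """Precision, recall, F1 and IoU for the positive (building) class.

    Hypothetical helper: `pred` and `target` are binary masks of equal shape,
    where 1 marks a building pixel and 0 marks background.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = int(np.sum(pred & target))    # building pixels correctly detected
    fp = int(np.sum(pred & ~target))   # background predicted as building
    fn = int(np.sum(~pred & target))   # building pixels that were missed

    # Guard against empty denominators (e.g. a tile with no buildings).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

# Toy 2x3 masks, purely for illustration.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
scores = building_metrics(pred, target)
print(scores)
```

In this toy case there are two true positives, one false positive, and one false negative, so recall and F1 are both 2/3 and IoU is 0.5.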

Highlights

  • Buildings are fundamental elements of a physical urban environment [1]

  • Deep Encoding Network (DE-Net) is trained with Dice and binary cross-entropy losses to address the sample imbalance problem in building extraction

  • Note that SRI-Net [47] is directly trained on the WHU dataset, and SRI-Net-UC is pre-trained on the University of California (UC) Merced Land Use Dataset and trained on the WHU dataset
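The highlight on training with Dice and binary cross-entropy (BCE) losses can be illustrated with a short NumPy sketch. It assumes an unweighted sum of the two terms and a smoothing constant of 1 in the Dice term; neither detail is stated in this excerpt, so the paper's actual formulation may differ. The function name `dice_bce_loss` is hypothetical.

```python
import numpy as np

def dice_bce_loss(prob, target, eps=1e-7, smooth=1.0):
    """Sum of binary cross-entropy and (1 - soft Dice coefficient).

    Assumptions (not from the paper): equal weighting of the two terms and
    a smoothing constant `smooth` that keeps the Dice ratio defined on
    tiles without buildings.
    """
    # Clip probabilities so the logarithms below stay finite.
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)

    # BCE averages over all pixels, so abundant background dominates it.
    bce = -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))

    # Soft Dice measures overlap with the building class, which makes it
    # less sensitive to the building/background imbalance.
    intersection = np.sum(prob * target)
    dice = (2.0 * intersection + smooth) / (np.sum(prob) + np.sum(target) + smooth)

    return float(bce + (1.0 - dice))

# Toy example: one building pixel among four.
target = np.array([1.0, 0.0, 0.0, 0.0])
good = dice_bce_loss(target, target)        # near-perfect prediction
bad = dice_bce_loss(1.0 - target, target)   # fully inverted prediction
print(good, bad)
```

The Dice term depends on overlap rather than per-pixel averages, which is why it is often paired with BCE when positive pixels are scarce.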

Summary

Introduction

Buildings are fundamental elements of a physical urban environment [1]. Information such as the location, size, and number of buildings is indispensable for many geographic and social applications, e.g., building thematic mapping [2], land-use mapping [3], urban planning [4], change detection [5], and population estimation [6]. Such applications require wide coverage, high accuracy, and regular updates, which makes high-resolution remote sensing (HRRS) imagery the most suitable data source. A desirable method is one that generalizes effectively to various unseen situations, and this is where deep convolutional neural networks (DCNNs) shine.

