Abstract

Semantic labeling for high resolution aerial images is a fundamental and necessary task in remote sensing image analysis. It is widely used in land-use surveys, change detection, and environmental protection. Recent research shows the superiority of Convolutional Neural Networks (CNNs) in this task. However, multi-scale object recognition and accurate object localization remain two major problems for CNN-based semantic labeling methods in high resolution aerial images. To handle these problems, we design a Context Fuse Module, which is composed of parallel convolutional layers with kernels of different sizes and a global pooling branch, to aggregate context information at multiple scales. We propose an Attention Mix Module, which utilizes a channel-wise attention mechanism to combine multi-level features for higher localization accuracy. We further employ a Residual Convolutional Module to refine features at all feature levels. Based on these modules, we construct a new end-to-end network for semantic labeling in aerial images. We evaluate the proposed network on the ISPRS Vaihingen and Potsdam datasets. Experimental results demonstrate that our network outperforms competing methods on both datasets using only raw image data.
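To make the two module ideas in the abstract more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes fixed averaging kernels in place of learned convolutions, and it assumes a simple sigmoid gate computed from globally pooled features as the channel-wise attention; the function names (box_filter, context_fuse, channel_attention_mix), the kernel sizes, and the gating formula are all hypothetical choices made for illustration.

```python
import math


def box_filter(fmap, k):
    """Apply a k x k averaging kernel with zero padding (output keeps input size).

    Assumption: a fixed averaging kernel stands in for a learned convolution.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += fmap[y][x]
            out[i][j] = total / (k * k)
    return out


def global_pool_branch(fmap):
    """Global average pooling, broadcast back to the input size."""
    h, w = len(fmap), len(fmap[0])
    mean = sum(sum(row) for row in fmap) / (h * w)
    return [[mean] * w for _ in range(h)]


def context_fuse(fmap, kernel_sizes=(1, 3, 5)):
    """Run parallel branches with different kernel sizes plus a global branch.

    Returns a list of channels (one per branch), each the size of the input.
    The kernel sizes are illustrative, not taken from the paper.
    """
    branches = [box_filter(fmap, k) for k in kernel_sizes]
    branches.append(global_pool_branch(fmap))
    return branches


def channel_attention_mix(low, high):
    """Combine low-level and high-level features with one gate per channel.

    Assumed gating rule: gate = sigmoid(global average of low + high),
    output = gate * low + high. Both inputs are lists of same-sized channels.
    """
    mixed = []
    for low_c, high_c in zip(low, high):
        h, w = len(low_c), len(low_c[0])
        pooled = sum(
            low_c[i][j] + high_c[i][j] for i in range(h) for j in range(w)
        ) / (h * w)
        gate = 1.0 / (1.0 + math.exp(-pooled))
        mixed.append(
            [[gate * low_c[i][j] + high_c[i][j] for j in range(w)]
             for i in range(h)]
        )
    return mixed


# Usage: fuse context on a tiny single-channel map, then mix two feature levels.
feature_map = [[1.0, 2.0], [3.0, 4.0]]
fused = context_fuse(feature_map)
mixed = channel_attention_mix(fused, fused)
print(len(fused), len(mixed))
```

In this sketch each branch sees a different neighborhood size, so the stacked channels carry context at several scales, while the per-channel gate decides how much of the low-level feature is added to the high-level one.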

Highlights

  • Semantic labeling means assigning each pixel a category label, which is known as semantic segmentation in the computer vision field

  • We evaluate our proposed Context Aggregation Network (CAN) model on two public datasets: the ISPRS Vaihingen dataset and ISPRS Potsdam dataset

  • We introduce a novel end-to-end Context Aggregation Network (CAN) for semantic labeling in aerial images


Summary

Introduction

Semantic labeling means assigning each pixel a category label, which is known as semantic segmentation in the computer vision field. Unlike the digital photos widely used in computer vision, aerial images contain objects with large scale variations that compose complex scenes, especially in urban areas. These scenes consist of many kinds of objects, such as buildings, vegetation, and trees. Cars are tiny, while trees are interwoven with vegetation. These properties make the semantic labeling task more difficult. Despite its difficulty, semantic labeling is a necessary task in aerial image interpretation and forms the basis for subsequent applications, including land-use analysis, environmental protection, urban change detection, and urban planning [4,5,6,7,8].

