Abstract

In this paper, we address the deep semantic segmentation of aerial imagery based on multi-modal data. Given multi-modal data composed of true orthophotos and the corresponding Digital Surface Models (DSMs), we extract a variety of hand-crafted radiometric and geometric features which are provided separately and in different combinations as input to a modern deep learning framework. The latter is represented by a Residual Shuffling Convolutional Neural Network (RSCNN) combining the characteristics of a Residual Network with the advantages of atrous convolution and a shuffling operator to achieve a dense semantic labeling. Via performance evaluation on a benchmark dataset, we analyze the value of different feature sets for the semantic segmentation task. The derived results reveal that the use of radiometric features yields better classification results than the use of geometric features for the considered dataset. Furthermore, the consideration of data on both modalities leads to an improvement of the classification results. However, the derived results also indicate that the use of all defined features is less favorable than the use of selected features. Consequently, data representations derived via feature extraction and feature selection techniques still provide a gain if used as the basis for deep semantic segmentation.

Highlights

  • The semantic segmentation of aerial imagery in terms of assigning a semantic label to each pixel and thereby providing meaningful segments has been addressed in the scope of many recent investigations and applications

  • The semantic segmentation of aerial imagery based on true orthophotos and the corresponding Digital Surface Models (DSMs) can be achieved via the extraction of hand-crafted features and the use of standard classifiers such as Random Forests (Gerke and Xiao, 2014; Weinmann and Weinmann, 2018) or Conditional Random Fields (CRFs) (Gerke, 2014)

  • For 16 patches, a very high-resolution true orthophoto and the corresponding DSM derived via dense image matching techniques are provided as well as a reference labeling with respect to six semantic classes represented by Impervious Surfaces, Building, Low Vegetation, Tree, Car and Clutter/Background
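The classical pipeline mentioned in the highlights, hand-crafted per-pixel features fed to a Random Forest, can be sketched as follows. This is an illustrative sketch only: the feature values and labels are synthetic stand-ins, and the feature count and classifier settings are assumptions, not the configuration used in the cited works.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for per-pixel hand-crafted features, e.g.
# radiometric features from the true orthophoto and geometric
# features from the corresponding DSM.
n_pixels, n_features = 1000, 5
X = rng.random((n_pixels, n_features))

# Six semantic classes, as in the benchmark reference labeling
# (Impervious Surfaces, Building, Low Vegetation, Tree, Car,
# Clutter/Background); here drawn at random.
y = rng.integers(0, 6, size=n_pixels)

# A standard Random Forest classifier applied per pixel.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
pred = clf.predict(X)
```

In practice, the feature vectors would be computed from real image and DSM patches and the classifier trained on the reference labeling rather than random data.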


Summary

Introduction

The semantic segmentation of aerial imagery in terms of assigning a semantic label to each pixel, and thereby providing meaningful segments, has been addressed in the scope of many recent investigations and applications. Many investigations rely on the use of modern deep learning techniques, which tend to significantly improve the classification results (Sherrah, 2016; Liu et al., 2017; Audebert et al., 2016; Audebert et al., 2017; Chen et al., 2018; Marmanis et al., 2016; Marmanis et al., 2018). Some of these approaches focus on using hand-crafted features derived from the true orthophotos or from their corresponding DSMs, in addition to the given data, as input to a deep learning technique. Other kinds of hand-crafted features have only rarely been involved so far, although they might introduce valuable information for the semantic labeling task.
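The idea of providing hand-crafted features derived from the orthophoto and the DSM as additional input channels to a network can be sketched as below. The specific features shown (an NDVI-like index and a crudely normalized DSM) and the tile size are assumptions for illustration, not the feature set used in the paper; all data here is synthetic.

```python
import numpy as np

# Hypothetical 256x256 tile: orthophoto bands (infrared, red, green)
# and the corresponding DSM, all filled with synthetic values.
h, w = 256, 256
rng = np.random.default_rng(0)
ir, r, g = (rng.random((h, w)) for _ in range(3))
dsm = rng.random((h, w)) * 30.0  # synthetic heights in metres

# Radiometric feature: an NDVI-like index from the IR and red bands.
ndvi = (ir - r) / (ir + r + 1e-8)

# Geometric feature: a crude normalized DSM, i.e. height above a
# ground estimate (here simply the tile minimum).
ndsm = dsm - dsm.min()

# Stack the raw bands and selected derived features as input
# channels for a convolutional network, shape (channels, H, W).
x = np.stack([ir, r, g, ndvi, ndsm], axis=0)
```

Different channel combinations of such a stack correspond to the different feature sets whose value for the segmentation task is compared in the paper.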

