Abstract

The recent success of deep convolutional neural networks (CNNs) in a large number of applications can be attributed to the large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas is presented that uses multi-resolution CNN features and hand-crafted spatial-spectral features of airborne remotely sensed data. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. Evidence theory is then used to infer a degree of belief for pixel labelling from the different sources, smoothing regions by handling the conflicts between the two classifiers while reducing uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas and consist of two data sources: LiDAR and a color infrared camera. The test sites are parts of a city in Germany that is assumed to contain typical object classes, including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on pixel-based confusion matrices computed by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height in the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation) but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on evidence combination theory.

Highlights

  • Object classification analysis is a very important topic in urban remote sensing

  • The major goal of this work is to present a workflow for semantic labelling in city areas using multi-spectral aerial imagery and Digital Surface Models (DSM), based on combining a convolutional neural network (CNN) image categorization scheme with conventional pixel-based

  • Conditional probability theory and Dempster-Shafer (DS) theory are applied to combine the CNN and logistic regression (LR) inference probabilities
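The decision-level fusion named in the last highlight can be illustrated with a minimal sketch of Dempster's rule of combination for a single pixel. This is not the authors' implementation: the way per-class probabilities are turned into mass functions (a simple reliability discount that assigns the remaining mass to the whole frame of discernment), the `THETA` key, the function names and the example numbers are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical key for the whole frame of discernment ("any class", i.e. ignorance).
THETA = "THETA"


def discount(probs: Dict[str, float], reliability: float) -> Dict[str, float]:
    """Turn class probabilities into a mass function (assumed scheme).

    Each class keeps reliability * p; the remaining mass goes to THETA.
    """
    masses = {label: reliability * p for label, p in probs.items()}
    masses[THETA] = 1.0 - reliability
    return masses


def dempster_combine(
    m1: Dict[str, float], m2: Dict[str, float]
) -> Tuple[Dict[str, float], float]:
    """Combine two mass functions over singleton classes plus THETA.

    Both inputs are assumed to share the same class labels.
    Returns the normalized combined masses and the conflict K.
    """
    labels = [k for k in m1 if k != THETA]
    combined = {}
    for c in labels:
        # {c} results from {c}∩{c}, {c}∩THETA and THETA∩{c}.
        combined[c] = m1[c] * m2[c] + m1[c] * m2[THETA] + m1[THETA] * m2[c]
    combined[THETA] = m1[THETA] * m2[THETA]
    # Conflict: mass assigned to pairs of different (disjoint) classes.
    conflict = sum(m1[c] * m2[d] for c in labels for d in labels if c != d)
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}, conflict


# Made-up per-pixel probabilities from the two classifiers.
cnn_probs = {"building": 0.6, "tree": 0.3, "car": 0.1}
lr_probs = {"building": 0.5, "tree": 0.4, "car": 0.1}

fused, conflict = dempster_combine(
    discount(cnn_probs, 0.9), discount(lr_probs, 0.9)
)
label = max((k for k in fused if k != THETA), key=fused.get)
```

In this sketch the pixel takes the class with the largest combined mass, and the conflict value K indicates how strongly the two sources disagree before normalization.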


Summary

Introduction

The results of such research are appealing for a wide range of data modeling tasks across diverse applications, including city mapping, urban environment assessment and road inventory. In the computer vision area, CNN features have been shown to outperform conventional hand-crafted features in visual recognition tasks such as image classification (Razavian et al., 2014) and object detection (Girshick et al., 2014), making CNNs among the most promising architectures for vision applications. CNNs appear to roughly mimic the mammalian visual cortex and exploit the strong spatially local correlation present in natural images. A deep CNN that consists of multiple layers of small neuron collections offers an efficient alternative approach to learning visual patterns directly from raw pixels.
