Abstract

Remote sensing experts have been actively using deep neural networks to solve extraction tasks in high-resolution aerial imagery by means of supervised semantic segmentation operations. However, the extraction operation is imperfect, due to the complex nature of geospatial objects, limitations of sensing resolution, or occlusions present in the scenes. In this work, we tackle the challenge of postprocessing semantic segmentation predictions of road surface areas obtained with a state-of-the-art segmentation model and present a technique based on generative learning and image-to-image translation concepts to improve these initial segmentation predictions. The proposed model is a conditional Generative Adversarial Network based on Pix2pix, heavily modified for computational efficiency (a 92.4% decrease in the number of parameters in the generator network and a 61.3% decrease in the discriminator network). The model is trained to learn the distribution of the road network present in official cartography, using a novel dataset containing 6784 tiles of 256 × 256 pixels, covering representative areas of Spain. Afterwards, we conduct a metrical comparison using the Intersection over Union (IoU) score (measuring the ratio between the overlap and union areas) on a novel testing set containing 1696 tiles (unseen during training) and observe a maximum increase of 11.6% in the IoU score (from 0.6726 to 0.7515). Finally, we conduct a qualitative comparison to visually assess the effectiveness of the technique and observe clear improvements over the initial semantic segmentation predictions.
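The IoU score reported above measures the ratio between the overlap and union areas of a predicted road mask and its ground-truth mask. A minimal sketch with NumPy (function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def iou_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over Union for binary masks (1 = road, 0 = background)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    intersection = np.logical_and(pred, truth).sum()
    return float(intersection) / float(union)

# Toy 4x4 tiles: prediction overlaps the ground truth on 2 of its 3 road pixels
truth = np.zeros((4, 4), dtype=np.uint8)
truth[1, 0:3] = 1                 # 3 ground-truth road pixels
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1, 1:4] = 1                  # 3 predicted road pixels, 2 overlapping
print(iou_score(pred, truth))     # intersection 2 / union 4 -> 0.5
```

At tile level, the per-tile scores would simply be averaged over the 1696 test tiles to obtain the aggregate figures quoted in the abstract.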

Highlights

  • Sensed images have been used lately by researchers in machine vision applications such as object identification [1,2], detection [3], or extraction [4]

  • Similar problems are still observed in recent works dealing with the road extraction from high-resolution aerial imagery—improving the road extraction task is an active area of research [11,12,13,14]

  • To overcome the deficiencies observed in our previous work [8], we developed a postprocessing technique based on image-to-image translation [15] concepts to operate over the initial semantic segmentation predictions and improve the road surface extraction for automatic mapping purposes

Summary

Introduction

Sensed images have been used lately by researchers in machine vision applications such as object identification [1,2], detection [3], or extraction [4]. Deep learning algorithms have proved useful for classification tasks and land use analysis [5] in satellite imagery data [6,7]—an important remote sensing application, where semantic segmentation techniques (based on supervised learning) are applied to assign a land cover class to every pixel of an image. This extraction task is generally carried out by means of semantic segmentation and can be considered very challenging due to the complex nature of geospatial objects, defects present in the imagery (noise, occlusions, etc.), imperfections in the ground-truth segmentation masks, or particularities of the segmentation algorithms applied. We study the appropriateness of applying generative learning techniques for postprocessing initial semantic segmentation predictions of road surface areas by conducting a metrical (IoU score) comparison and a perceptual validation on a new test set composed of 1696 real segmentation maps and their corresponding semantic segmentation predictions.
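Since the proposed model is based on Pix2pix, its training presumably follows the standard Pix2pix objective (Isola et al., 2017), which combines a conditional adversarial term with an L1 reconstruction term. In this setting, the conditioning input x would be the initial segmentation prediction and the target y the road mask from official cartography (this mapping is our reading of the abstract, not an explicit statement of the paper):

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
                         + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The L1 term keeps the postprocessed output close to the target road geometry, while the adversarial term pushes it toward the distribution of plausible road networks learned from the cartography.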

Related Works
Problem Description
Discriminator
Findings
Metrical Analysis and Perceptual Validation of the Results