Abstract

Training Convolutional Neural Networks (CNNs) for very high resolution images requires a large quantity of high-quality pixel-level annotations, which are extremely labor- and time-consuming to produce. Moreover, professional photo interpreters might have to be involved to guarantee the correctness of the annotations. To alleviate this burden, we propose a framework for semantic segmentation of aerial images based on incomplete annotations, where annotators are asked to label only a few pixels with easy-to-draw scribbles. To exploit these sparse scribbled annotations, we propose the FEature and Spatial relaTional regulArization (FESTA) method, which complements the supervised task with an unsupervised learning signal that accounts for neighborhood structure in both the spatial and the feature domains.
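To make the idea concrete, the sketch below shows one way a feature-and-spatial relational regularizer of this kind could be implemented in PyTorch. It is an illustrative approximation rather than the published FESTA loss: the pair-sampling scheme, the cosine-similarity measure, and the equal weighting of the two terms are all assumptions, and the function name and arguments are hypothetical.

```python
# Illustrative sketch (not the exact FESTA loss): encourage (i) spatially
# adjacent pixels to have similar features and (ii) each sampled feature to
# stay close to its nearest neighbour in feature space.
import torch
import torch.nn.functional as F

def relational_regularizer(features, num_pairs=1024):
    """features: (B, C, H, W) feature map from the segmentation CNN."""
    b, c, h, w = features.shape
    flat = features.permute(0, 2, 3, 1).reshape(b, h * w, c)  # (B, HW, C)

    # Sample random anchor pixels and take the pixel directly below each one
    # as its spatial neighbour (a simple stand-in for a true neighbourhood).
    idx = torch.randint(0, h * w, (b, num_pairs), device=features.device)
    ys, xs = idx // w, idx % w
    nidx = (ys + 1).clamp(max=h - 1) * w + xs

    anchors   = torch.gather(flat, 1, idx.unsqueeze(-1).expand(-1, -1, c))
    neighbors = torch.gather(flat, 1, nidx.unsqueeze(-1).expand(-1, -1, c))

    # Spatial term: adjacent pixels should have similar features.
    spatial_loss = 1.0 - F.cosine_similarity(anchors, neighbors, dim=-1).mean()

    # Feature term: pull each anchor towards its nearest neighbour in feature
    # space (excluding itself), keeping feature-space clusters compact.
    sims = F.normalize(anchors, dim=-1) @ F.normalize(flat, dim=-1).transpose(1, 2)
    sims.scatter_(2, idx.unsqueeze(-1), float('-inf'))  # mask the self-match
    nearest = torch.gather(
        flat, 1, sims.argmax(dim=2, keepdim=True).expand(-1, -1, c))
    feature_loss = 1.0 - F.cosine_similarity(anchors, nearest, dim=-1).mean()

    return spatial_loss + feature_loss
```

In practice such a term would be added, with some weight, to the supervised loss computed on the scribbled pixels, so that unlabeled pixels still contribute a learning signal.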

Highlights

  • Semantic segmentation of remote sensing imagery aims at identifying the land-cover or land-use category of each pixel in an image

  • To alleviate the requirement of dense pixelwise annotations, semisupervised learning approaches are proposed to make use of additional information, such as spatial relations or feature-level relations, for semantic segmentation

  • The Vaihingen data set is a benchmark data set for semantic segmentation provided by the International Society for Photogrammetry and Remote Sensing (ISPRS); 33 aerial images with a spatial resolution of 9 cm were collected over the city of Vaihingen, and each image covers an average area of 1.38 km²


Summary

INTRODUCTION

Semantic segmentation of remote sensing imagery aims at identifying the land-cover or land-use category of each pixel in an image. To alleviate the requirement of dense pixelwise annotations, semisupervised learning approaches are proposed to make use of additional information, such as spatial relations (e.g., neighboring pixels are likely to belong to the same class) or feature-level relations (e.g., pixels with similar CNN feature representations are likely to belong to the same class), for semantic segmentation. These methods aim to utilize low-cost annotations, such as points [2], scribbles [3], [4], or image-level labels [5], [6]. To demonstrate the effectiveness of our learning framework, extensive experiments are conducted on two very high resolution (VHR) data sets: the Vaihingen and Zurich Summer data sets.
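The supervised part of such a framework is typically a standard cross-entropy loss restricted to the few annotated pixels. The snippet below is a minimal sketch of this masking idea, assuming the scribble map stores class ids on labeled pixels and a reserved ignore value everywhere else; the variable names and the ignore value are illustrative, not taken from the paper's code.

```python
# Hedged sketch of supervision with sparse scribble labels: pixels the
# annotator did not touch carry IGNORE_INDEX and contribute nothing to the
# cross-entropy term.
import torch.nn as nn

IGNORE_INDEX = 255  # assumed value marking unlabeled pixels in the scribble map

criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)

def supervised_scribble_loss(logits, scribble_labels):
    """logits: (B, num_classes, H, W); scribble_labels: (B, H, W) with class
    ids on scribbled pixels and IGNORE_INDEX everywhere else."""
    return criterion(logits, scribble_labels)
```

An unsupervised relational term, such as the regularizer sketched after the abstract, can then be added to this loss so that the unlabeled majority of pixels also shapes the learned features.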

Supervision With Sparse Annotations
Feature and Spatial Relational Regularization
CRF for Boundary Refinement
Data Set Description
Scribbled Annotation Generation
Comparing With Existing Methods
Discussion on Annotation Type
CONCLUSION