Abstract

Existing deep learning-based semantic segmentation methods for remote sensing images require large-scale labeled datasets. However, annotating segmentation datasets is often too time-consuming and expensive. To ease the burden of data annotation, self-supervised representation learning methods have emerged recently. However, semantic segmentation requires both high-level and low-level features, whereas most existing self-supervised representation learning methods focus on only one level, which limits segmentation performance on remote sensing images. To solve this problem, we propose a self-supervised multitask representation learning method to capture effective visual representations of remote sensing images. We design three different pretext tasks and a triplet Siamese network to learn high-level and low-level image features at the same time. The network can be trained without any labeled data, and the trained model can be fine-tuned with an annotated segmentation dataset. We conduct experiments on the Potsdam and Vaihingen datasets and the cloud/snow detection dataset Levir_CS to verify the effectiveness of our method. Experimental results show that the proposed method can effectively reduce the demand for labeled data and improve the performance of remote sensing semantic segmentation. Compared with recent state-of-the-art self-supervised representation learning methods and the most commonly used initialization methods (such as random initialization and ImageNet pretraining), our method achieves the best results in most experiments, especially when training data are scarce. With only 10% to 50% of the labeled data, our method achieves performance comparable to that of random initialization. Code is available at https://github.com/flyakon/SSLRemoteSensing.
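To make the shared-weight ("Siamese") pretraining idea concrete, the sketch below shows one way a triplet of unlabeled views can train a shared encoder without any segmentation labels. This is a minimal NumPy illustration under our own assumptions, not the paper's actual three pretext tasks or backbone: the single linear `encode` function stands in for the shared branch, and a standard triplet margin loss stands in for the paper's objectives. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared encoder parameters: one linear map standing in for the
# backbone that all three Siamese branches share (illustrative only).
W = rng.normal(scale=0.1, size=(16, 8))

def encode(x):
    """Shared-weight branch: every view passes through the same parameters."""
    return np.tanh(x @ W)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: pull anchor/positive embeddings together,
    push anchor/negative embeddings apart by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

# Three "views" built from unlabeled imagery: two light augmentations of the
# same tile (anchor, positive) and a different tile (negative). No labels used.
tile = rng.normal(size=(4, 16))          # batch of 4 flattened tiles
anchor = encode(tile + 0.01 * rng.normal(size=tile.shape))
positive = encode(tile + 0.01 * rng.normal(size=tile.shape))
negative = encode(rng.normal(size=(4, 16)))

loss = triplet_loss(anchor, positive, negative)
print(float(loss))
```

After such pretraining, the shared encoder's weights would initialize the segmentation network, which is then fine-tuned on the (much smaller) annotated dataset.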

Highlights

  • The rapid development of remote sensing technology has greatly widened the scope of exploring the earth

  • The results show that our method outperforms other recent self-supervised representation learning methods [24, 32, 33] in the semantic segmentation task

  • Compared with random initialization, our method achieves comparable performance with only 50% of the labeled data on the Vaihingen dataset and 20% on the Potsdam dataset, while only 20% of the labeled data for cloud detection and 10% for snow detection are needed to reach comparable performance

Introduction

The rapid development of remote sensing technology has greatly widened the scope of exploring the earth. Recent FCN-based semantic segmentation methods for remote sensing images still rely on training with a large number of manually annotated samples. Although some annotated datasets are available, most remote sensing data from the Internet are not labeled in a way that suits the semantic segmentation task. These unlabeled data therefore contribute nothing to improving the semantic segmentation of remote sensing images. The purpose of this paper is to design an effective pretraining method that uses unlabeled data to improve the performance of remote sensing semantic segmentation.

