Abstract

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant to a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data are scarce. To address this problem, we propose a novel self-supervised pretraining scheme that initializes a transformer-based network using large-scale unlabeled data. Specifically, the model is asked to predict randomly contaminated observations given the entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pretraining is completed, the pretrained network can be adapted to various SITS classification tasks by fine-tuning all model parameters on small-scale, task-related labeled data. In this way, general knowledge and representations of SITS are transferred to a label-scarce task, improving the generalization performance of the model and reducing the risk of overfitting. Comprehensive experiments were carried out on three benchmark datasets covering large study areas. Experimental results demonstrate the effectiveness of the proposed pretraining scheme, which leads to substantial improvements in classification accuracy for the transformer, a 1-D convolutional neural network, and a bidirectional long short-term memory network. The code and the pretrained model will be available at https://github.com/linlei1214/SITS-BERT upon publication.
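The pretext task described above can be illustrated with a minimal sketch: a fraction of the observations in a pixel's time series is randomly contaminated, and the model is trained to recover the clean values at exactly those positions. The masking rate, noise model, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pixel time series: T observations x B spectral bands (values are synthetic).
T, B = 24, 10
series = rng.uniform(0.0, 0.5, size=(T, B))

# Randomly pick ~15% of the observations to contaminate (the rate and the
# additive-noise corruption are assumptions for this sketch).
mask = rng.random(T) < 0.15
mask[0] = True  # ensure at least one masked position for the demo
corrupted = series.copy()
corrupted[mask] += rng.normal(0.0, 0.5, size=(mask.sum(), B))

def pretext_loss(predicted, original, mask):
    """MSE between predictions and clean values, on masked positions only."""
    diff = predicted[mask] - original[mask]
    return float(np.mean(diff ** 2))

# A model that perfectly recovers the clean series reaches zero loss,
# while simply echoing the corrupted input does not.
assert pretext_loss(series, series, mask) == 0.0
assert pretext_loss(corrupted, series, mask) > 0.0
```

During pretraining, the network would receive `corrupted` as input and be optimized to minimize `pretext_loss` on the masked positions, which requires it to model the temporal structure of the surrounding observations.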

Highlights

  • Manuscript received September 15, 2020; revised October 22, 2020; accepted November 3, 2020.

  • We compared the performance of the evaluated algorithms using the following metrics derived from the confusion matrix [18]: 1) Overall Accuracy (OA): the proportion of correctly classified samples among all test samples, computed by dividing the number of correctly classified samples by the total number of test samples.

  • To assess the effectiveness of the proposed network, we compared it with five methods widely employed in satellite image time series (SITS) classification.
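As a minimal illustration of the OA metric described above (not the authors' code), the correctly classified samples are the diagonal of the confusion matrix, so OA is simply the trace divided by the total sample count. The toy matrix below is hypothetical.

```python
import numpy as np

def overall_accuracy(cm):
    """OA = correctly classified samples (diagonal) / total test samples."""
    cm = np.asarray(cm)
    return np.trace(cm) / cm.sum()

# Toy 3-class confusion matrix (rows: reference labels, columns: predictions).
cm = [[50, 2, 3],
      [4, 40, 1],
      [2, 3, 45]]
print(overall_accuracy(cm))  # 135 correct out of 150 -> 0.9
```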


Summary

INTRODUCTION

YUAN AND LIN: SELF-SUPERVISED PRETRAINING OF TRANSFORMERS FOR SATELLITE IMAGE TIME SERIES CLASSIFICATION

Nowadays, a huge volume of Earth observation (EO) data is being accumulated thanks to remarkable […] temporal characteristics from spectral profiles. To exploit both spatial and temporal information of high-resolution SITS, hybrid architectures combining convolutional and recurrent layers [24], [25] and convolutional-recurrent neural networks [26] have been introduced and comprehensively compared [27]. The central idea behind our pretext task is to leverage the inherent temporal structure of satellite time series to capture meaningful spectral-temporal characteristics from a large volume of SITS data; these characteristics are closely related to the natural changes at the Earth's surface. In this way, enormous amounts of background knowledge are accumulated in the network through pretraining, making the model "understand" what satellite time series should look like.

RELATED WORK
MOTIVATION
Overall Network Architecture
Observation Embedding
Transformer Encoder
Pretraining SITS-BERT
Fine-Tuning SITS-BERT
STUDY AREAS AND DATASETS
Pretraining Dataset
Two Crop Classification Datasets
Land Cover Mapping Dataset
Data Collection and Preprocessing
Evaluation Criteria and Methods
Model Configuration
Effect of the Pretraining Scheme on Other Models
Influence of the Number of Labeled Samples
Computational Efficiency
CONCLUSION
