Abstract

Spatiotemporal fusion provides an effective way to fuse two types of remote sensing data with complementary spatial and temporal properties (typical representatives are Landsat and MODIS images) into fused data with both high spatial and high temporal resolution. This paper presents a very deep convolutional neural network (VDCN) based spatiotemporal fusion approach to effectively handle massive remote sensing data in practical applications. Compared with existing shallow learning methods, especially the sparse representation based ones, the proposed VDCN-based model has the following merits: (1) it explicitly correlates the MODIS and Landsat images by learning a non-linear mapping relationship; (2) it automatically extracts effective image features; and (3) it unifies feature extraction, non-linear mapping, and image reconstruction in one optimization framework. In the training stage, we first train a non-linear mapping between downsampled Landsat and MODIS data using a VDCN, and then train a multi-scale super-resolution (MSSR) VDCN between the original Landsat and downsampled Landsat data. The prediction procedure contains three layers, each consisting of a VDCN-based prediction and a fusion model. These layers successively achieve the non-linear mapping from MODIS to downsampled Landsat data, the two-times super-resolution (SR) of the downsampled Landsat data, and the five-times SR of the downsampled Landsat data. Extensive evaluations are conducted on two groups of commonly used Landsat–MODIS benchmark datasets. The quantitative evaluations on all prediction dates and the visual comparison on one key date demonstrate that the proposed approach achieves more accurate fusion results than sparse representation based methods.
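The three-layer prediction cascade described above can be sketched as a simple data-flow pipeline. In this minimal sketch, each trained VDCN stage is replaced by a nearest-neighbour upsampling placeholder and the fusion model by a plain average, so only the shapes and the ordering of the stages (MODIS mapping, then 2× SR, then 5× SR, each followed by fusion with a known-date reference) are illustrated; the function names and the fusion rule are assumptions, not the authors' implementation.

```python
import numpy as np

def vdcn_stage(image, scale):
    """Placeholder for a trained VDCN: nearest-neighbour upsampling by `scale`."""
    return np.kron(image, np.ones((scale, scale)))

def fuse(prediction, reference):
    """Placeholder fusion model: average the prediction with a same-size
    reference image (e.g. a known-date observation at that scale)."""
    return 0.5 * (prediction + reference)

def predict(modis, ref_coarse, ref_mid, ref_fine):
    # Layer 1: non-linear mapping from MODIS to the downsampled-Landsat grid
    # (same pixel grid, so no upsampling in this placeholder).
    coarse = fuse(vdcn_stage(modis, 1), ref_coarse)
    # Layer 2: two-times super-resolution of the downsampled Landsat data.
    mid = fuse(vdcn_stage(coarse, 2), ref_mid)
    # Layer 3: five-times super-resolution, reaching full Landsat resolution
    # (2 * 5 = 10x the downsampled grid overall).
    return fuse(vdcn_stage(mid, 5), ref_fine)

rng = np.random.default_rng(0)
modis = rng.random((24, 24))
out = predict(modis,
              ref_coarse=rng.random((24, 24)),
              ref_mid=rng.random((48, 48)),
              ref_fine=rng.random((240, 240)))
print(out.shape)  # (240, 240)
```

The two SR factors multiply, so the cascade bridges a 10× resolution gap in two smaller, easier-to-learn steps rather than one.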

Highlights

  • One of the fundamental features of remote sensing data is the resolution in spatial, spectral, temporal, and radiometric domains

  • For convenience of description, the proposed very deep convolutional network based spatiotemporal fusion method is abbreviated as VDCNSTF

  • Our method handled the challenging land-cover type changes better than CNNSTF, which may be attributed to two facts: the deep learning model correlates MODIS and Landsat images better than the shallow learning model, and the very deep convolutional neural network (VDCN) based multi-scale super-resolution (MSSR) model achieves higher prediction accuracy than the one-step super-resolution model in CNNSTF


Introduction

One of the fundamental features of remote sensing data is its resolution in the spatial, spectral, temporal, and radiometric domains. Images from sensors such as Landsat TM/ETM+ and SPOT/HRV have high spatial resolutions of 10–30 m but low temporal resolutions of half a month to one month, while images from sensors such as MODIS, AVHRR, and MERIS have low spatial resolutions of 250–1000 m but daily temporal coverage. These remote sensing data are therefore complementary in spatial and temporal resolution. We introduce several representative works for each category.
