Abstract

High-resolution (HR) satellite images, due to technical constraints on spectral and spatial resolutions, usually contain only a few broad spectral bands but offer very high spatial resolution. This provides rich spatial detail about objects on the Earth's surface, while their spectral discrimination is relatively low. Recently, increased satellite revisit times have made it possible to acquire more frequent data coverage for finer classification. In this article, we propose a novel multitemporal deep fusion network (MDFN) for short-term multitemporal HR image classification. Specifically, MDFN adopts a two-branch structure consisting of a long short-term memory (LSTM) network and a convolutional neural network (CNN). The LSTM branch is mainly used to learn a joint representation of the different temporal-spectral features. In the CNN branch, three-dimensional (3-D) convolutions are first applied along the temporal and spectral dimensions to jointly learn temporal-spatial and spectral-spatial information, respectively, and 2-D convolutions are then performed along the spatial dimension to further extract spatial context. Finally, the features generated by the two branches are fused to obtain discriminative high-level semantic information for classification. Experimental results on two real multitemporal HR remote sensing datasets demonstrate that the proposed MDFN provides better classification performance than state-of-the-art methods, and they also show the potential of short-term multitemporal HR images for more accurate land use/land cover mapping.
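The following is a minimal PyTorch sketch of the two-branch layout described above. It is not the authors' implementation: the class name, layer widths, patch size, number of dates and bands, and the choice to feed the centre-pixel spectra to the LSTM branch are illustrative assumptions.

```python
# Minimal sketch of the two-branch MDFN idea (assumed sizes, not the paper's exact config).
import torch
import torch.nn as nn

class MDFNSketch(nn.Module):
    def __init__(self, n_times=3, n_bands=4, n_classes=6):
        super().__init__()
        # LSTM branch: each acquisition's band values at the centre pixel form
        # one step of a length-n_times sequence.
        self.lstm = nn.LSTM(input_size=n_bands, hidden_size=64, batch_first=True)
        # CNN branch: one 3-D conv along the temporal axis and one along the
        # spectral axis, followed by 2-D convs along the spatial axes.
        self.temporal3d = nn.Conv3d(n_bands, 16, kernel_size=(n_times, 3, 3), padding=(0, 1, 1))
        self.spectral3d = nn.Conv3d(n_times, 16, kernel_size=(n_bands, 3, 3), padding=(0, 1, 1))
        self.spatial2d = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64 + 64, n_classes)

    def forward(self, x):
        # x: (batch, n_times, n_bands, H, W) stack of multitemporal patches
        b, t, c, h, w = x.shape
        # LSTM branch on the centre-pixel spectra of each date
        centre = x[:, :, :, h // 2, w // 2]                 # (b, t, c)
        _, (h_n, _) = self.lstm(centre)
        f_lstm = h_n[-1]                                     # (b, 64)
        # CNN branch: 3-D conv along time, then along bands, then 2-D spatial conv
        ft = self.temporal3d(x.permute(0, 2, 1, 3, 4)).squeeze(2)      # (b, 16, H, W)
        fs = self.spectral3d(x).squeeze(2)                              # (b, 16, H, W)
        f_cnn = self.spatial2d(torch.cat([ft, fs], dim=1)).flatten(1)   # (b, 64)
        # Fuse both branches for classification
        return self.classifier(torch.cat([f_lstm, f_cnn], dim=1))

logits = MDFNSketch()(torch.randn(2, 3, 4, 9, 9))  # -> (2, 6) class scores
```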

Highlights

  • With the rapid development of Earth observation (EO) technology, high-resolution and even very-high-resolution (VHR) satellites (e.g., the Pleiades, Gaofen, and WorldView series) have been launched with revisit times of about 1 to 5 days

  • From the fluctuations of the monotemporal (T1-T3) and multitemporal (TM) curves, one can see that the overall accuracy (OA) of multitemporal image classification is higher than that of monotemporal image classification and is less affected by random sampling, exhibiting a more stable performance

  • By comparing the multitemporal results obtained by different methods, the proposed multitemporal deep fusion network (MDFN) approach achieves the highest OA values and the lowest deviations, exhibiting the best performance


Summary

INTRODUCTION

With the rapid development of Earth observation (EO) technology, high-resolution and even very-high-resolution (VHR) satellites (e.g., the Pleiades, Gaofen, and WorldView series) have been launched with revisit times of about 1 to 5 days (see TABLE I). Existing approaches still produce many misclassification errors because they do not further exploit the invariant temporal-spectral features to suppress abrupt or abnormal changes in each monotemporal image. To overcome these drawbacks, this paper proposes a novel framework, the Multitemporal Deep Fusion Network (MDFN), for short-term multitemporal HR image classification, in which long short-term memory (LSTM) and CNN branches are combined to extract and fuse rich spatio-temporal-spectral features. In particular, 2-D convolutions applied to the concatenation of two types of 3-D convolution features are designed to capture spatial context information. This guarantees a strong descriptive capability for the invariant spatio-temporal-spectral features that contribute to the improved classification. The LSTM input gate, for example, is defined as $i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$.
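For reference, the remaining gates of a standard LSTM cell follow the same pattern; the equations below are the textbook formulation, and the paper's exact notation may differ slightly.

```latex
\begin{aligned}
i_t &= \sigma(W_i[h_{t-1}, x_t] + b_i) \\
f_t &= \sigma(W_f[h_{t-1}, x_t] + b_f) \\
o_t &= \sigma(W_o[h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c[h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```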

RELATED WORK
PROPOSED MDFN FRAMEWORK
LSTM Branch
CNN Branch
Multi-level Feature Fusion and Classification
Data Sets Descriptions
Experimental Setup and Parameter Settings
Classification Performance
CONCLUSION