Abstract

Large-scale crop mapping from dense time-series imagery is a difficult task and becomes even more challenging under cloud cover. Current deep learning models frequently represent time series from a single perspective, which is insufficient to capture fine-grained details. Meanwhile, the impact of cloud noise on deep learning models is not yet fully understood. In this study, a Multi-scale Temporal Transformer-Conv network (Ms-TTC) is proposed for robust crop mapping under frequent cloud cover. Ms-TTC enhances temporal representations by effectively combining the global modeling capability of self-attention with the local feature-capture capability of convolutional neural networks (CNNs) at multiple temporal scales. The Ms-TTC network consists of three main components: (1) a temporal encoder module that explores global and local temporal relationships at multiple temporal scales, (2) an attention-based fusion module that effectively fuses multi-scale temporal features, and (3) an output module that concatenates the high-level time-series features with the refined multi-scale features to predict the label. The proposed model outperformed state-of-the-art methods on the large-scale time-series dataset FranceCrops, achieving a minimum improvement of 2% in mF1 score. Subsequently, gradient back-propagation-based feature-importance analysis was used to investigate how deep learning models process time-series data with cloud noise. The results revealed that most deep learning models can suppress cloudy observations to some degree, and that models with a global field of view mask clouds better but also lose some local temporal information. Clouds can shift the model's attention along the spectral dimension, particularly affecting the visible and vegetation red-edge bands, which exhibit higher sensitivity to cloud noise and play a crucial role in performance.
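The global-plus-local temporal encoding described above can be illustrated with a minimal PyTorch sketch. This is an assumption-laden toy, not the authors' implementation: the class names (`TemporalScaleBlock`, `MsTTCSketch`), the average-pooling scheme for producing coarser temporal scales, the additive fusion, and all dimensions are hypothetical choices made for illustration only.

```python
# Hypothetical sketch of a multi-scale temporal encoder that combines
# self-attention (global temporal context) with 1-D convolution (local
# temporal patterns). All names and design details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalScaleBlock(nn.Module):
    """One temporal scale: self-attention for global relationships,
    Conv1d for local patterns; outputs combined by residual addition."""
    def __init__(self, dim, heads=4, kernel=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):              # x: (batch, time, dim)
        g, _ = self.attn(x, x, x)      # global view of the whole series
        l = self.conv(x.transpose(1, 2)).transpose(1, 2)  # local view
        return self.norm(x + g + l)

class MsTTCSketch(nn.Module):
    """Toy multi-scale variant: each branch pools the series to a coarser
    temporal resolution, encodes it, and the branch features are fused
    (here by simple concatenation) before classification."""
    def __init__(self, in_bands, dim, n_classes, scales=(1, 2, 4)):
        super().__init__()
        self.embed = nn.Linear(in_bands, dim)
        self.scales = scales
        self.blocks = nn.ModuleList(TemporalScaleBlock(dim) for _ in scales)
        self.head = nn.Linear(dim * len(scales), n_classes)

    def forward(self, x):              # x: (batch, time, bands)
        x = self.embed(x)
        feats = []
        for s, blk in zip(self.scales, self.blocks):
            xs = x if s == 1 else F.avg_pool1d(
                x.transpose(1, 2), s).transpose(1, 2)
            feats.append(blk(xs).mean(dim=1))  # temporal average pooling
        return self.head(torch.cat(feats, dim=-1))

model = MsTTCSketch(in_bands=10, dim=32, n_classes=20)
logits = model(torch.randn(8, 48, 10))  # 8 parcels, 48 dates, 10 bands
print(logits.shape)
```

The parallel global/local branches are the essential idea; an attention-weighted fusion of the per-scale features, as the abstract describes, would replace the plain concatenation here.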
This study provides a feasible approach for large-scale, dynamic crop mapping that is robust to cloudy conditions by combining global-local temporal representations at multiple scales.
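The gradient back-propagation-based feature-importance analysis mentioned above can be sketched as a simple saliency probe: back-propagate the predicted-class score to the input time series and read the absolute gradient per date and band as an importance map. The helper name `saliency` and the stand-in classifier are hypothetical; the paper's exact attribution method may differ.

```python
# Hypothetical gradient-based importance probe: |d score / d input| per
# (date, band) entry of the input time series. Illustrative only.
import torch
import torch.nn as nn

def saliency(model, x):
    """Return an importance map the same shape as x (batch, time, bands)."""
    x = x.clone().requires_grad_(True)
    score = model(x).max(dim=-1).values.sum()  # top-class scores
    score.backward()
    return x.grad.abs()

# Toy stand-in classifier over a flattened series (48 dates, 10 bands).
toy = nn.Sequential(nn.Flatten(), nn.Linear(48 * 10, 20))
importance = saliency(toy, torch.randn(4, 48, 10))
print(importance.shape)
```

Averaging such maps over cloudy versus clear observations is one way to check whether a model suppresses cloud-contaminated dates, as the analysis in this study does.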
