Abstract

Precise crop mapping is crucial for guiding agricultural production, forecasting crop yield, and ensuring food security. Integrating optical and synthetic aperture radar (SAR) satellite image time series (SITS) for crop classification is an essential and challenging task in remote sensing. Previous studies generally employ a dual-branch network that learns optical and SAR features independently, ignoring the complementarity and correlation between the two modalities. In this article, we propose a novel method that learns optical and SAR features for crop classification through cross-modal contrastive learning. Specifically, we develop an updated dual-branch network whose two branches partially share weights to reduce model complexity. Furthermore, we train the network to map features of different modalities from the same class to nearby locations in a latent space, while keeping samples from distinct classes far apart, thereby learning discriminative and modality-invariant features. We conducted a comprehensive evaluation of the proposed method on a large-scale crop classification dataset. Experimental results show that our method consistently outperforms traditional supervised learning approaches, regardless of whether training samples are abundant or scarce. Our findings demonstrate that unifying the representations of optical and SAR image time series enables the network to learn more competitive features and suppress inference noise.
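
The cross-modal objective described above can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the code below assumes a generic supervised contrastive formulation: each optical embedding is pulled toward SAR embeddings of the same class and pushed away from those of other classes, and vice versa. The function names, the `temperature` value, and the symmetric averaging are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def _directional_loss(anchors, candidates, labels, temperature):
    """Contrastive loss with `anchors` from one modality and `candidates` from the other."""
    # Scaled cosine similarities between every anchor and every candidate: shape (N, N).
    logits = anchors @ candidates.T / temperature
    # Numerically stable log-softmax over the candidates of each anchor.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives are cross-modal pairs that share a class label.
    pos_mask = (labels[:, None] == labels[None, :]).astype(float)
    # Every anchor has at least one positive (its own paired sample), so no division by zero.
    per_anchor = -(pos_mask * log_prob).sum(axis=1) / pos_mask.sum(axis=1)
    return per_anchor.mean()


def cross_modal_contrastive_loss(opt_feats, sar_feats, labels, temperature=0.1):
    """Symmetric supervised contrastive loss between optical and SAR embeddings.

    opt_feats, sar_feats: arrays of shape (N, D), row i of each describing the same sample.
    labels: integer array of shape (N,) with the crop class of each sample.
    temperature: assumed hyperparameter; the source does not state a value.
    """
    # L2-normalise so that dot products are cosine similarities.
    opt = opt_feats / np.linalg.norm(opt_feats, axis=1, keepdims=True)
    sar = sar_feats / np.linalg.norm(sar_feats, axis=1, keepdims=True)
    loss_o2s = _directional_loss(opt, sar, labels, temperature)
    loss_s2o = _directional_loss(sar, opt, labels, temperature)
    return 0.5 * (loss_o2s + loss_s2o)


# Toy example: two classes, embeddings that agree across modalities vs. ones that do not.
labels = np.array([0, 0, 1, 1])
optical = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
sar_aligned = optical.copy()
sar_misaligned = optical[::-1].copy()

aligned_loss = cross_modal_contrastive_loss(optical, sar_aligned, labels)
misaligned_loss = cross_modal_contrastive_loss(optical, sar_misaligned, labels)
```

In this toy setting the loss is small when same-class optical and SAR embeddings coincide and large when they are swapped, which is the behaviour a modality-invariant, class-discriminative latent space is meant to reward.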
