Abstract

Deep convolutional neural networks (DCNNs) are among the most effective tools for land use classification of high-resolution remote sensing images. Land use classification that fuses optical and synthetic aperture radar (SAR) images has broad application prospects, but related studies remain scarce. In this study, we developed the first and largest joint optical and SAR land use classification dataset, WHU-OPT-SAR, covering an area of approximately 50,000 km², and designed a multimodal cross-attention network (MCANet). MCANet comprises three core modules: a pseudo-siamese feature extraction module, a multimodal cross-attention module, and a low-high level feature fusion module, which perform independent feature extraction of optical and SAR images, second-order hidden feature mining, and multi-scale feature fusion, respectively. The land use classification accuracy of our approach on the WHU-OPT-SAR dataset was approximately 5% higher than that of optical-image-based approaches. Moreover, the accuracy of city, village, road, water, forest, and farmland classification improved by 7%, 2%, 5%, 6%, 1%, and 0.6%, respectively, reflecting the benefit of fusing optical and SAR images. Furthermore, the classification accuracy over Hubei Province, China, which covers an area of 190,000 km², also increased by approximately 5%, verifying the effectiveness of our approach.
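To make the three modules concrete, below is a minimal, illustrative PyTorch sketch of the ideas the abstract names: a pseudo-siamese pair of encoders (identical architecture, unshared weights, one branch per modality), a cross-attention block in which each modality attends to the other, and a fusion of low- and high-level features before the classification head. All names here (ToyMCANet, conv_block, CrossAttention) and all layer sizes are hypothetical assumptions for illustration, not the authors' released implementation; the 7-class output assumes the WHU-OPT-SAR label set.

```python
# Hypothetical sketch of the MCANet ideas, not the authors' code.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions followed by 2x spatial downsampling."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class CrossAttention(nn.Module):
    """Each modality queries the other: optical attends to SAR and vice versa."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn_o2s = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.attn_s2o = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, f_opt, f_sar):
        b, c, h, w = f_opt.shape
        # Flatten spatial dims into token sequences: (B, H*W, C).
        o = f_opt.flatten(2).transpose(1, 2)
        s = f_sar.flatten(2).transpose(1, 2)
        o_att, _ = self.attn_o2s(o, s, s)   # optical queries SAR features
        s_att, _ = self.attn_s2o(s, o, o)   # SAR queries optical features
        f_opt = f_opt + o_att.transpose(1, 2).reshape(b, c, h, w)
        f_sar = f_sar + s_att.transpose(1, 2).reshape(b, c, h, w)
        return f_opt, f_sar


class ToyMCANet(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        # Pseudo-siamese: same architecture, separate (unshared) weights.
        self.opt_low, self.opt_high = conv_block(3, 32), conv_block(32, 64)
        self.sar_low, self.sar_high = conv_block(1, 32), conv_block(32, 64)
        self.cross = CrossAttention(64)
        # Low-high fusion: upsample high-level features, concat with low-level.
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.head = nn.Sequential(
            nn.Conv2d(32 + 32 + 64 + 64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, num_classes, 1),
        )

    def forward(self, opt, sar):
        lo_o, lo_s = self.opt_low(opt), self.sar_low(sar)       # low-level, 1/2 res
        hi_o, hi_s = self.opt_high(lo_o), self.sar_high(lo_s)   # high-level, 1/4 res
        hi_o, hi_s = self.cross(hi_o, hi_s)                     # cross-modal interaction
        fused = torch.cat([lo_o, lo_s, self.up(hi_o), self.up(hi_s)], dim=1)
        return self.head(fused)


if __name__ == "__main__":
    net = ToyMCANet()
    opt = torch.randn(1, 3, 64, 64)   # optical patch (RGB)
    sar = torch.randn(1, 1, 64, 64)   # single-band SAR patch
    print(net(opt, sar).shape)        # torch.Size([1, 7, 64, 64])
```

Keeping the two encoders unshared (pseudo-siamese rather than siamese) matters because optical and SAR imaging statistics differ greatly; the cross-attention step is where the second-order interaction between the two feature streams is mined.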
