Abstract

As two different tools for Earth observation, optical and synthetic aperture radar (SAR) images can provide complementary information about the same land types for better land cover classification. However, because of the different imaging mechanisms of optical and SAR images, how to efficiently exploit this complementary information is an interesting and challenging problem. In this article, we propose a novel multimodal bilinear fusion network (MBFNet), which fuses optical and SAR features for land cover classification. The MBFNet consists of three components: the feature extractor, the second-order attention-based channel selection module (SACSM), and the bilinear fusion module. First, to prevent the network parameters from being biased toward the dominant modality, a pseudo-siamese convolutional neural network (CNN) is taken as the feature extractor to extract deep semantic feature maps from the optical and SAR images, respectively. Then, the SACSM is embedded into each stream, and fine channel-attention maps with second-order statistics are obtained by bilinearly integrating the global average-pooling and global max-pooling information. The SACSM not only automatically highlights the important channels of the feature maps to improve the representation power of the network, but also uses a channel selection mechanism to reconfigure compact feature maps with better discrimination. Finally, bilinear pooling is used as the feature-level fusion method: it establishes second-order associations between the two compact feature maps of the optical and SAR streams to obtain low-dimensional bilinear fusion features for land cover classification. Experimental results on three broad coregistered optical and SAR datasets demonstrate that our method achieves more effective land cover classification performance than state-of-the-art methods.
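To make the two modules above concrete, the following is a minimal PyTorch sketch inferred from the abstract alone; it is not the authors' implementation. The bottleneck MLP inside SACSM, the top-q channel gathering, and the bilinear_fusion helper (including its signed square-root and L2 normalization) are assumptions about how such modules are commonly realized.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SACSM(nn.Module):
        # Second-order attention-based channel selection module (sketch).
        # Bilinearly integrates global average-pooling and global max-pooling
        # statistics into a channel-attention vector, then keeps the q most
        # important channels as a compact feature map.
        def __init__(self, channels, q, s=16):
            super().__init__()
            self.q = q
            self.fc = nn.Sequential(               # bottleneck MLP with reduction ratio s
                nn.Linear(channels, channels // s),
                nn.ReLU(inplace=True),
                nn.Linear(channels // s, channels),
            )

        def forward(self, x):                                     # x: (B, C, H, W)
            avg = F.adaptive_avg_pool2d(x, 1).flatten(1)          # (B, C)
            mx = F.adaptive_max_pool2d(x, 1).flatten(1)           # (B, C)
            outer = torch.bmm(avg.unsqueeze(2), mx.unsqueeze(1))  # (B, C, C) second-order statistics
            attn = torch.sigmoid(self.fc(outer.mean(dim=2)))      # (B, C) channel-attention vector
            x = x * attn[:, :, None, None]                        # highlight important channels
            idx = attn.topk(self.q, dim=1).indices                # select the q most important channels
            idx = idx[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
            return torch.gather(x, 1, idx)                        # (B, q, H, W) compact feature map

    def bilinear_fusion(f_opt, f_sar):
        # Bilinear pooling of the optical and SAR compact feature maps:
        # second-order associations between the two streams, followed by
        # signed square-root and L2 normalization.
        b, c, h, w = f_opt.shape
        z = torch.bmm(f_opt.flatten(2), f_sar.flatten(2).transpose(1, 2)) / (h * w)  # (B, q, q)
        z = z.flatten(1)                                          # flatten to a fusion vector
        z = torch.sign(z) * torch.sqrt(z.abs() + 1e-8)            # signed square-root
        return F.normalize(z)                                     # L2-normalized fusion feature

In this sketch the fusion vector has q x q dimensions, so the channel selection step is what keeps the bilinear fusion feature low-dimensional before it is passed to the classifier.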

Highlights

  • Land cover classification plays an important role in land-use analysis, environmental protection, urban planning, etc.

  • For the multimodal bilinear fusion network (MBFNet), we find that three hyperparameters have an obvious impact on the land cover classification experiments: the patch size h × w, the number of selected important channels q, and the reduction ratio s in the second-order attention-based channel selection module (SACSM); a hypothetical configuration sketch follows this list

  • The kappa coefficient (Kappa) and overall accuracy (OA) of the MBFNet are superior to those of the other comparison methods
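
This excerpt does not report the values actually used for these hyperparameters, so the configuration below is a purely hypothetical sketch of how the three settings fit together; every value is an assumption, not a result from the paper.

    # Hypothetical MBFNet hyperparameter settings; all values are assumptions.
    mbfnet_config = {
        "patch_size": (32, 32),  # h x w size of the coregistered input patches
        "q": 128,                # important channels kept by each SACSM stream
        "s": 16,                 # reduction ratio of the SACSM bottleneck MLP
    }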

Introduction

Land cover classification plays an important role in land-use analysis, environmental protection, urban planning, etc. Most existing land cover classification methods use only unimodal remote sensing (RS) images: many methods using optical images suffer from the spectral confusion issue, which lowers the classification accuracy [2], [3], and others using synthetic aperture radar (SAR) images show poor classification because of the quality of SAR images and noise interference [4]. With the rapid development of RS techniques, it is possible to obtain multimodal RS data from the same region, and the optical and SAR images can provide a variety of information on the land properties, such as the spectral information of optical images [5] and the scattering information of SAR images [4], [6]. Numerous studies have shown that optical and SAR data can provide complementary information from the individual sources, which benefits land cover classification [7], [8]. For example, the SAR patches of the tree and unknown classes are similar, whereas their optical patches have obvious differences.
