Abstract

Rapid advances in sensor technology have greatly simplified the acquisition of high-resolution satellite and airborne remote sensing images. Many practical applications of high-resolution remote sensing images (HRRSIs) rely on semantic segmentation. However, single-modal HRRSIs are difficult to classify accurately when scene objects are complex, so semantic segmentation based on multi-source information fusion is gaining popularity. The performance of existing multimodal fusion methods is typically limited by the inherent differences between modalities and by the semantic gap between multi-level features. To address these issues, we propose a multimodal fusion network guided by edge detection, which exploits the spatial information contained in object boundaries to assist multimodal fusion. In the feature extraction stage, an edge detection guide module derives boundary information by fusing the detail of low-level features with the semantics of high-level features. This boundary information is then fed into a multimodal adaptive fusion block (MAFB) to obtain the fused multimodal features. In the decoding stage, a residual adaptive fusion block (RAFB) and a spatial position module (SPM) fuse multi-level features from the standpoint of both local and global dependence. We compared our method with several state-of-the-art (SOTA) methods on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam datasets, and the results demonstrate that it achieves excellent performance.
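
The abstract does not specify the internals of the MAFB, so the following is only a minimal sketch of what an edge-guided adaptive fusion of two modality feature maps could look like. The class name, channel sizes, sigmoid gating, and the choice of an optical/DSM modality pair are all assumptions introduced here for illustration, not the authors' implementation.

```python
# Hypothetical sketch of an edge-guided multimodal adaptive fusion block.
# All layer choices below are assumptions; the paper's MAFB may differ.
import torch
import torch.nn as nn


class MultimodalAdaptiveFusionBlock(nn.Module):
    """Fuses two modality feature maps of shape (B, C, H, W),
    modulated by a single-channel boundary map from an edge branch."""

    def __init__(self, channels: int):
        super().__init__()
        # Project the 1-channel edge map to a per-pixel gate in [0, 1].
        self.edge_gate = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Predict per-pixel modality weights from the concatenated features.
        self.weight = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x_rgb, x_aux, edge):
        w = self.weight(torch.cat([x_rgb, x_aux], dim=1))  # adaptive weights
        fused = w * x_rgb + (1.0 - w) * x_aux              # weighted modality mix
        gate = self.edge_gate(edge)                        # boundary emphasis
        return self.refine(fused + gate * fused)           # edge-guided refinement


# Quick shape check with random tensors (batch 2, 64 channels, 128x128).
block = MultimodalAdaptiveFusionBlock(channels=64)
rgb = torch.randn(2, 64, 128, 128)
dsm = torch.randn(2, 64, 128, 128)
edge = torch.randn(2, 1, 128, 128)
print(block(rgb, dsm, edge).shape)  # torch.Size([2, 64, 128, 128])
```

The intuition captured by this sketch is that the sigmoid-gated edge map amplifies fused responses near object boundaries, where single-modal features are most ambiguous, which matches the abstract's claim that boundary spatial information aids multimodal fusion.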
