Abstract

The key to hyperspectral image (HSI) and multispectral image (MSI) fusion is to exploit the inter-spectral self-similarities of HSIs and the spatial correlations of MSIs. However, leading convolutional neural network (CNN)-based methods fall short in capturing long-range dependencies and the self-similarity prior. To this end, we propose a simple yet efficient Transformer-based network, the hyperspectral and multispectral image fusion (HMF)-Former, for HSI/MSI fusion. The HMF-Former adopts a U-shaped architecture with a spatio-spectral Transformer block (SSTB) as its basic unit. Within the SSTB, embedded spatial-wise multihead self-attention (Spa-MSA) and spectral-wise multihead self-attention (Spe-MSA) effectively capture interactions among spatial regions and inter-spectral dependencies, respectively. These two mechanisms match the spatial correlations of MSIs and the inter-spectral self-similarities of HSIs. In addition, the specially designed SSTB enables the HMF-Former to capture both local and global features while maintaining linear complexity. Extensive experiments on four benchmark datasets show that our method significantly outperforms state-of-the-art methods.
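To illustrate why spectral-wise attention keeps the cost linear in the number of spatial positions, below is a minimal PyTorch sketch of a Spe-MSA-style layer: attention maps are formed between spectral channels rather than between pixels, so their size is independent of the image resolution. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; the class name SpectralMSA and all hyperparameters are hypothetical.

```python
# Minimal sketch of spectral-wise multi-head self-attention (Spe-MSA style).
# Illustrative only: names and shapes are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralMSA(nn.Module):
    """Attention computed across spectral channels, so the attention map is
    (C/heads x C/heads) per head and the cost is linear in H*W."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.to_qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
        self.proj = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=1)
        # Reshape to (batch, heads, channels-per-head, spatial positions).
        q = q.reshape(b, self.heads, c // self.heads, h * w)
        k = k.reshape(b, self.heads, c // self.heads, h * w)
        v = v.reshape(b, self.heads, c // self.heads, h * w)
        # Normalize along the spatial axis, then form channel-channel
        # attention: interactions between spectral bands, not pixels.
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)
        out = (attn @ v).reshape(b, c, h, w)
        return self.proj(out)

# Usage: a 32-channel feature map at 64x64 resolution.
x = torch.randn(1, 32, 64, 64)
y = SpectralMSA(dim=32, heads=4)(x)
print(y.shape)  # torch.Size([1, 32, 64, 64])
```

Doubling the spatial resolution only lengthens the last tensor axis; the attention matrices stay the same size, which is the property the abstract credits for the SSTB's linear complexity.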
