Abstract

To obtain a high-resolution hyperspectral image (HR-HSI), fusing a low-resolution hyperspectral image (LR-HSI) with a high-resolution multispectral image (HR-MSI) is a prominent approach. Numerous methods based on convolutional neural networks (CNNs) have been proposed for hyperspectral image (HSI) and multispectral image (MSI) fusion. Nevertheless, these CNN-based methods may ignore globally relevant features of the input image due to the geometric limitations of convolutional kernels. To obtain more accurate fusion results, we propose a spatial-spectral transformer-based U-net (SSTF-Unet). Our SSTF-Unet can capture associations between distant features and explore the intrinsic information of images. More specifically, we use the spatial transformer block (SATB) and spectral transformer block (SETB) to compute spatial and spectral self-attention, respectively. The SATB and SETB are then connected in parallel to form the spatial-spectral fusion block (SSFB). Inspired by the U-net architecture, we build our SSTF-Unet by stacking several SSFBs for multiscale spatial-spectral feature fusion. Experimental results on public HSI datasets demonstrate that SSTF-Unet outperforms existing HSI and MSI fusion approaches.
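The abstract does not give implementation details, but the core idea of the SSFB, a spatial-attention branch and a spectral-attention branch running in parallel, can be illustrated with a minimal PyTorch sketch. Everything here is an assumption for illustration: the class names, the use of standard multi-head self-attention, the 1x1-convolution fusion of the two branches, and the fixed spatial size required by tokenizing each band as one vector in the spectral branch. The actual SATB/SETB designs in the paper may differ.

```python
import torch
import torch.nn as nn

class SpatialTransformerBlock(nn.Module):
    """Hypothetical SATB: self-attention over spatial positions (tokens = pixels)."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):                          # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C): one token per pixel
        t = self.norm(tokens)
        out, _ = self.attn(t, t, t)                # attention across spatial locations
        return (tokens + out).transpose(1, 2).reshape(b, c, h, w)

class SpectralTransformerBlock(nn.Module):
    """Hypothetical SETB: self-attention over spectral bands (tokens = channels)."""
    def __init__(self, spatial_dim):
        super().__init__()
        self.norm = nn.LayerNorm(spatial_dim)
        # single head so embed_dim (= H*W) need not be divisible by a head count
        self.attn = nn.MultiheadAttention(spatial_dim, num_heads=1, batch_first=True)

    def forward(self, x):                          # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2)                      # (B, C, H*W): one token per band
        t = self.norm(tokens)
        out, _ = self.attn(t, t, t)                # attention across spectral bands
        return (tokens + out).reshape(b, c, h, w)

class SpatialSpectralFusionBlock(nn.Module):
    """Hypothetical SSFB: SATB and SETB in parallel, fused by an assumed 1x1 conv."""
    def __init__(self, channels, spatial_dim, num_heads=4):
        super().__init__()
        self.satb = SpatialTransformerBlock(channels, num_heads)
        self.setb = SpectralTransformerBlock(spatial_dim)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([self.satb(x), self.setb(x)], dim=1))

# Usage on a toy feature map (32 bands, 16x16 spatial grid):
x = torch.randn(1, 32, 16, 16)
block = SpatialSpectralFusionBlock(channels=32, spatial_dim=16 * 16)
y = block(x)                                       # (1, 32, 16, 16)
```

Running the two branches in parallel, rather than in sequence, lets one block model long-range spatial dependencies and inter-band correlations from the same input, which matches the abstract's description of the SSFB; in a U-net arrangement, such blocks would then be stacked at multiple scales.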
