Hyperspectral images (HSIs) are used in a growing range of applications, but their low spatial resolution seriously limits their usefulness. Fusing a low-resolution hyperspectral image (LR-HSI) with a high-resolution multispectral image (HR-MSI) has become a mainstream approach to HSI super-resolution reconstruction. However, most existing fusion methods do not fully exploit the wide range of scales present in remote sensing images, and they neglect the preservation of spatial-spectral information during fusion. Considering that the spectral information in the fused high-resolution hyperspectral image (HR-HSI) depends mainly on the LR-HSI, while the spatial information depends mainly on the HR-MSI, this paper proposes a full-scale linked Unet with spatial-spectral joint perceptual attention for hyperspectral and multispectral image fusion (FSL-Unet). FSL-Unet consists of two modules. The first is the spatial-spectral attention extraction module (SSAE), which computes the spectral attention of the LR-HSI and the spatial attention of the HR-MSI at different scales. The second is the full-scale link U-shaped fusion module (FLUF), which adopts a multi-level feature extraction strategy and uses denser full-scale skip connections to explore feature information at a finer granularity, enabling flexible combination of multi-scale and multi-path features. In addition, we propose spatial-spectral joint perceptual attention (SSJPA) on the encoder side of FLUF. SSJPA makes full use of the attention maps computed by the SSAE to embed spatial and spectral information effectively into the fused image, enabling uninterrupted information transfer and aggregation. To demonstrate the effectiveness of FSL-Unet, we conduct experiments on five public hyperspectral datasets. Compared with eight other state-of-the-art fusion methods, FSL-Unet achieves competitive results.
The source code for FSL-Unet can be downloaded from https://github.com/wxy11-27/FSL-Unet.
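To make the attention idea concrete, the following is a minimal single-scale sketch, not the authors' SSAE or SSJPA implementation (the repository above is the authoritative source). It assumes that spectral attention can be approximated by a sigmoid gate over each LR-HSI band's spatial mean, and that spatial attention can be approximated by a sigmoid gate over the per-pixel mean of the HR-MSI bands. The function names `spectral_attention`, `spatial_attention`, and `apply_joint_attention` are hypothetical, and the learned convolutions and multi-scale processing described in the abstract are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spectral_attention(cube):
    """Return one weight per band: sigmoid of that band's spatial mean.

    cube is a list of bands, each band a 2-D list (height x width).
    Assumption: stands in for a learned spectral attention on the LR-HSI.
    """
    weights = []
    for band in cube:
        pixels = [value for row in band for value in row]
        weights.append(sigmoid(sum(pixels) / len(pixels)))
    return weights


def spatial_attention(cube):
    """Return a height x width map: sigmoid of the per-pixel mean over bands.

    Assumption: stands in for a learned spatial attention on the HR-MSI.
    """
    n_bands = len(cube)
    height, width = len(cube[0]), len(cube[0][0])
    return [
        [sigmoid(sum(cube[b][i][j] for b in range(n_bands)) / n_bands)
         for j in range(width)]
        for i in range(height)
    ]


def apply_joint_attention(features, band_weights, pixel_weights):
    """Scale features[c][i][j] by band_weights[c] * pixel_weights[i][j]."""
    return [
        [[value * band_weights[c] * pixel_weights[i][j]
          for j, value in enumerate(row)]
         for i, row in enumerate(channel)]
        for c, channel in enumerate(features)
    ]


# Toy inputs: a 3-band, 1x1 LR-HSI and a 1-band, 2x2 HR-MSI.
lr_hsi = [[[0.0]], [[1.0]], [[-1.0]]]
hr_msi = [[[0.0, 2.0], [-2.0, 0.0]]]

# Hypothetical encoder features: 3 channels at the HR-MSI resolution.
features = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]

band_weights = spectral_attention(lr_hsi)
pixel_weights = spatial_attention(hr_msi)
fused = apply_joint_attention(features, band_weights, pixel_weights)
```

Here the band weights come only from the LR-HSI and the pixel weights only from the HR-MSI, mirroring the abstract's premise that spectral information depends mainly on the former and spatial information on the latter. The multiplicative combination is an illustrative choice, not a claim about how SSJPA merges the two maps.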