Owing to hardware limitations, acquired infrared images often suffer from low resolution, blurred details, and poor visual quality. Guiding infrared super-resolution reconstruction with visible light images is an effective way to raise infrared image resolution. However, visible light and infrared images are formed by different imaging principles, so their detail information differs, and reconstruction can suffer from blur and ghosting. This paper proposes an infrared image super-resolution network based on visible light image guidance and recursive fusion. Within the network, a flow Fourier residual module is designed, and modules at different depths extract information at different frequencies from the visible light and infrared images, so that each module focuses on the appropriate frequency information. In addition, a hybrid attention module captures detail information in the multimodal images from both channel and spatial perspectives and fuses it in a complementary way, which helps suppress artifacts. On this basis, a global recursive fusion branch is designed that accounts for the correlation between multi-layer features and fuses them adaptively to generate clearer high-resolution infrared images. Experimental results show that the proposed method outperforms the compared methods on objective evaluation metrics; in subjective visual comparisons, its reconstructed images show clearer textures, fewer artifacts, and better object distinction in complex environments.
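To make the idea of attending to multimodal features "from channel and spatial perspectives" more concrete, the following is a minimal, purely illustrative sketch. It is not the hybrid attention module of this paper: the function names, the parameter-free pooling, the sigmoid gating, and the residual-style fusion are all assumptions chosen for brevity, whereas a real module would typically use learned convolutional layers. Features are plain nested lists of shape channels x height x width.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(feat):
    """Hypothetical channel attention: one weight per channel,
    computed from that channel's global average."""
    weights = []
    for channel in feat:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def spatial_weights(feat):
    """Hypothetical spatial attention: one weight per pixel,
    computed from the mean across channels at that pixel."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def hybrid_attention_fuse(ir, vis):
    """Assumed fusion rule: gate the visible-light features by both
    attention maps, then add them to the infrared features."""
    cw = channel_weights(vis)
    sw = spatial_weights(vis)
    fused = []
    for k, (ir_ch, vis_ch) in enumerate(zip(ir, vis)):
        fused.append([
            [ir_ch[i][j] + cw[k] * sw[i][j] * vis_ch[i][j]
             for j in range(len(ir_ch[0]))]
            for i in range(len(ir_ch))
        ])
    return fused
```

Under these assumptions, calling `hybrid_attention_fuse(ir, vis)` with two feature maps of the same shape returns a fused map of that shape; visible-light features that are zero everywhere leave the infrared features unchanged, which reflects the guidance role of the visible branch in this sketch.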