Many ophthalmic and systemic diseases can be screened by analyzing retinal fundus images, and the clarity and resolution of these images directly determine the effectiveness of clinical diagnosis. Deep learning methods based on generative adversarial networks are widely used across research fields, especially in image super-resolution, due to their powerful generative capabilities. Although Real-ESRGAN, a recently proposed method, excels at processing real-world degraded images, it suffers from structural distortions when super-resolving retinal fundus images, which are rich in structural information. To address this shortcoming, we first process the input image with a pre-trained U-Net model to obtain a segmentation map of the retinal vessels, which serves as a structural prior. A spatial feature transform (SFT) layer then integrates this prior into the generation process of the generator. In addition, we introduce channel and spatial attention modules into the skip connections of the discriminator to emphasize meaningful features and thereby enhance its discriminative power. On top of the original loss functions, we add an L1 loss that measures the pixel-level difference between the retinal vascular segmentation maps of the high-resolution and super-resolved images, further constraining the super-resolved output. Experiments on retinal image datasets show that our improved algorithm achieves better visual quality by suppressing structural distortions in the super-resolved images.
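To make the two structure-aware components concrete, the sketch below shows a minimal PyTorch version of an SFT-style modulation layer and the segmentation-consistency L1 loss. It is an illustration of the general technique under stated assumptions, not the authors' exact code: the channel widths, the names `SFTLayer`, `structure_l1_loss`, and `seg_net`, and the choice to keep the U-Net frozen on the high-resolution branch are all assumptions.

```python
import torch
import torch.nn as nn

class SFTLayer(nn.Module):
    """Spatial feature transform: modulates generator features with
    per-pixel scale/shift maps predicted from the structural prior,
    i.e. out = feat * gamma(prior) + beta(prior)."""
    def __init__(self, prior_ch=1, feat_ch=64, hidden_ch=32):
        super().__init__()
        self.scale = nn.Sequential(
            nn.Conv2d(prior_ch, hidden_ch, 3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(hidden_ch, feat_ch, 3, padding=1),
        )
        self.shift = nn.Sequential(
            nn.Conv2d(prior_ch, hidden_ch, 3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(hidden_ch, feat_ch, 3, padding=1),
        )

    def forward(self, feat, prior):
        # prior: vessel segmentation map resized to feat's spatial size
        return feat * self.scale(prior) + self.shift(prior)

def structure_l1_loss(seg_net, sr_img, hr_img):
    """L1 distance between the vessel segmentation maps of the
    super-resolved and ground-truth high-resolution images."""
    with torch.no_grad():
        seg_hr = seg_net(hr_img)   # pre-trained U-Net, no gradient needed
    seg_sr = seg_net(sr_img)       # gradients flow back to the generator
    return nn.functional.l1_loss(seg_sr, seg_hr)

# Usage: modulate a feature map with a (hypothetical) vessel probability map.
feat = torch.randn(1, 64, 32, 32)
prior = torch.rand(1, 1, 32, 32)
out = SFTLayer()(feat, prior)      # same shape as feat
```

Because the scale and shift maps vary spatially, this conditioning can strengthen the prior's influence precisely along the vessel structures while leaving smooth background regions largely unchanged.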