Magnetic particle imaging (MPI) visualizes the concentration distribution of superparamagnetic iron-oxide nanoparticles (SPIONs) in tissue with high sensitivity and high temporal resolution. However, its low spatial resolution limits its application. Increasing the gradient strength of the selection field improves the resolution of MPI, but it also increases power consumption and noise. A feasible and cost-effective way to address this limitation is to reconstruct a high-gradient (HG) image algorithmically from a low-gradient (LG) image. Deep learning has become a powerful tool for improving the resolution of medical imaging techniques. In this study, we propose a Resolution Enhancement Transformer Network (RETNet) that takes a low-resolution LG image as input and reconstructs a high-resolution HG image, avoiding the high power consumption and high noise of a system operating with an HG field. RETNet uses a shallow feature extractor to capture shallow features, a cross-scale Transformer (CST) to focus on textural features, a residual Swin Transformer (RST) to focus on structural features, and an image reconstruction module that aggregates these three types of features and reconstructs the HG image. The extracted textural and structural features help preserve fine detail and sharpness in the reconstructed image. Ablation experiments demonstrate that the CST and RST modules both contribute significantly to reconstructing the HG image. Comparative experiments, conducted under noise-free conditions and at multiple noise levels, confirm the robustness of RETNet. Simulation, phantom, and in vivo experiments consistently demonstrate that RETNet outperforms competing methods and effectively improves the resolution of MPI.
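The link between gradient strength and resolution stated above can be illustrated with a minimal sketch. It assumes the idealized Langevin model of particle magnetization, in which the point-spread function is proportional to the derivative of the Langevin function and its full width at half maximum (FWHM) scales inversely with the gradient. The function names and the default parameters (25 nm core diameter, 0.6 T saturation magnetization, 300 K) are illustrative assumptions rather than values from this study, and the model ignores relaxation effects and noise.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
MU0 = 4e-7 * math.pi    # vacuum permeability, T*m/A


def langevin_derivative(xi):
    """Derivative of the Langevin function L(xi) = coth(xi) - 1/xi."""
    if abs(xi) < 1e-4:
        # Series expansion near zero avoids cancellation: L'(xi) ~ 1/3 - xi^2/15
        return 1.0 / 3.0 - xi * xi / 15.0
    return 1.0 / (xi * xi) - 1.0 / math.sinh(xi) ** 2


def psf_fwhm_dimensionless(tol=1e-10):
    """FWHM of L'(xi) in units of xi, found by bisection on the half maximum."""
    half_max = langevin_derivative(0.0) / 2.0
    lo, hi = 0.0, 10.0  # L'(xi) decreases monotonically for xi > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if langevin_derivative(mid) > half_max:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo


def mpi_fwhm_m(gradient_t_per_m, core_diameter_m=25e-9,
               temperature_k=300.0, mu0_msat_t=0.6):
    """Idealized Langevin-model FWHM in metres (all defaults are assumptions)."""
    msat = mu0_msat_t / MU0                               # A/m
    moment = msat * math.pi * core_diameter_m ** 3 / 6.0  # A*m^2
    # xi = moment * G * x / (k_B * T), so a width in xi maps to x by k_B*T / (moment*G)
    return psf_fwhm_dimensionless() * K_B * temperature_k / (moment * gradient_t_per_m)


if __name__ == "__main__":
    for g in (1.0, 2.0, 4.0):  # hypothetical LG-to-HG gradient values, T/m
        print(f"G = {g:.1f} T/m -> FWHM = {mpi_fwhm_m(g) * 1e3:.2f} mm")
```

Under these assumptions, doubling the gradient halves the FWHM, which is the physical motivation for recovering HG-quality images from LG acquisitions instead of raising the field strength.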