Accurate pore-scale modeling demands high-quality digital rock images that combine a broad imaging field of view (FOV) with high resolution, so that multi-scale rock components can be characterized. However, achieving both simultaneously is challenging due to hardware constraints. Super-resolution techniques can mitigate this issue by reconstructing high-resolution details from low-resolution images captured with a wide FOV. To reconstruct high-quality 3D digital rocks, we propose a novel Efficient Attention Super-Resolution Transformer (EAST) model, which integrates self-attention and channel attention mechanisms and is structurally optimized. Evaluation demonstrates that EAST achieves superior reconstruction quality with a 1.85× speedup and 78% fewer parameters compared with the advanced model. Additionally, to suit the characteristics of digital rocks and the limited dataset, we employ a hybrid loss function along with two data augmentation techniques. Visualizations reveal that EAST resists noise and blur interference while highlighting valuable features such as pore edges and textures. Finally, we introduce an approach based on self-supervised fine-tuning to improve model robustness. Direct flow simulation verifies that the reconstructed results closely align with high-resolution images in terms of physical accuracy. EAST reduces the relative error of the absolute permeability by 18.5% and 33% compared with the Tricubic method on two external samples.
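The permeability comparison above rests on a relative error measured against the high-resolution reference. The following is a minimal sketch, assuming the common definition |k_rec − k_ref| / k_ref, which the abstract itself does not spell out. The function name and all permeability values are hypothetical and chosen only for illustration; they are not results from this work.

```python
def relative_permeability_error(k_reconstructed: float, k_reference: float) -> float:
    """Relative error of absolute permeability, as a fraction of the reference.

    Assumed definition: |k_reconstructed - k_reference| / k_reference.
    """
    if k_reference <= 0:
        raise ValueError("reference permeability must be positive")
    return abs(k_reconstructed - k_reference) / k_reference


# Hypothetical permeabilities (e.g. in millidarcy), for illustration only.
k_hr = 100.0        # from flow simulation on the high-resolution image
k_tricubic = 140.0  # from a tricubic-interpolated image
k_sr = 110.0        # from a super-resolved image

err_tricubic = relative_permeability_error(k_tricubic, k_hr)
err_sr = relative_permeability_error(k_sr, k_hr)

print(f"tricubic: {err_tricubic:.1%}, super-resolved: {err_sr:.1%}")
```

Normalizing by the reference permeability makes errors comparable across samples whose absolute permeabilities differ by orders of magnitude.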