The Real-Time DEtection TRansformer (RT-DETR) object detection model has three shortcomings when applied to rail fastener defect detection: poor extraction of defect features, inefficient use of computational resources, and suboptimal channel attention in its self-attention mechanism. Two improvements were made to address them. First, a Super-Resolution Convolutional Module (SRConv) was designed as a separate component and integrated into the Backbone network. It enhances image detail and clarity while preserving the original image structure and semantic content, which improves the model’s ability to extract defect features. Second, a channel attention mechanism was integrated into the self-attention module of RT-DETR to strengthen the focus on feature map channels. This addresses the sparse attention maps caused by the lack of channel attention while saving computational resources. Experimental results show that, compared with the original model, the improved RT-DETR-based rail fastener defect detection algorithm adds 0.4 MB of parameters and achieves higher accuracy: the Mean Average Precision (mAP) across IoU thresholds from 0.5 to 0.9 increases by 2.8 percentage points, and the Average Recall (AR) across the same thresholds increases by 1.7 percentage points.
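To make the reported metrics concrete, the following is a minimal sketch of how a detection metric can be averaged over a range of IoU thresholds. It is an illustration under stated assumptions, not the evaluation code used in this work: the threshold step of 0.05 is assumed (the text gives only the range 0.5 to 0.9), boxes are assumed to be in `(x1, y1, x2, y2)` format, and the function names `iou`, `iou_thresholds`, and `mean_over_thresholds` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds(start=0.5, stop=0.9, step=0.05):
    """Threshold grid from start to stop inclusive. The 0.05 step is an assumption."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]


def mean_over_thresholds(metric_at):
    """Average a per-threshold metric (e.g. AP or recall), given as {threshold: value}."""
    thresholds = iou_thresholds()
    return sum(metric_at[t] for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # A predicted box counts as a match at threshold t when its IoU with the
    # ground-truth box is at least t.
    overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
    matched_at = [t for t in iou_thresholds() if overlap >= t]
    print(overlap, matched_at)
```

Under this reading, a gain of 2.8 percentage points in mAP means the averaged value rises by 0.028 on a 0 to 1 scale, not that any single threshold improves by that amount.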