Detecting defects in high-speed railway train wheels from ultrasonic-testing (UT) B-scan images is difficult because these images are characterized by small size and complexity, so a robust detection method is needed. The proposed algorithm, UT-YOLO, was designed to address these specific challenges. By adopting optimized convolutional layers, a special layer design, and an attention mechanism, UT-YOLO improves learning capacity, small-target detection accuracy, and overall processing speed. In experimental evaluations on high-speed railway wheel UT datasets, best recall, mAP@0.5, and mAP@0.5:0.95 increased by 37%, 36%, and 43%, respectively. Its speed also meets real-time processing requirements, which makes it a practical solution for the dynamic, high-speed environment of railway inspection. In addition, an ultrasonic defect detection dataset based on real wheels was created, and the method has been applied in actual scenarios, where it has helped to greatly improve the efficiency of manual detection.