Existing deep learning methods for rail surface defect detection suffer from poor compatibility with embedded detection systems, high computational resource consumption, and slow detection speeds. To address these challenges, we propose a lightweight rail surface defect detection algorithm based on an improved YOLOv8 (code: https://github.com/haichao67/GD-YOLOv8). Inspired by Huawei's Gold-YOLO, our algorithm redesigns the neck architecture, integrates feature extraction, fusion, and injection modules, and replaces Complete Intersection over Union (CIoU) with Scylla Intersection over Union (SIoU). Comparative experiments confirm its efficacy: it achieves 94.9% precision and 81.9% recall on the Rail Surface Defect Dataset, a 26.3% increase in frames per second, and a model size reduction to 63%. Our algorithm thus maintains satisfactory detection results while reducing the number of model parameters and increasing detection speed.