In steel strip production, accurate defect detection remains difficult because of the diversity of defect types, complex backgrounds, and noise interference. To improve surface defect detection in steel strips, we propose an enhanced detection model, YOLOv8-BSPB. First, we design a novel pooling layer module, SCRD, which replaces max pooling with average pooling. This module introduces the receptive field block (RFB) and deformable convolutional network version 4 (DCNv4) to obtain learnable offsets, allowing convolutional kernels to move and deform flexibly on the input feature map and thus extract multi-scale features more effectively. Second, we integrate a polarized self-attention (PSA) mechanism to improve the model’s feature representation and its ability to focus on relevant information. Third, we incorporate the bottleneck attention module (BAM) after the C2f module to strengthen the model’s feature selection. Fourth, a bidirectional feature pyramid network (BiFPN) is introduced in the neck of the model to improve the efficiency of feature transmission. Finally, the Wise-IoU (WIoU) loss function is employed to accelerate convergence and improve regression accuracy. Experimental results on the NEU-DET dataset show that the improved model achieves a classification accuracy of 81.3%, an increase of 4.9% over the baseline, with a mean average precision of 86.9%. The model has 5.5 M parameters and runs at 103.1 FPS. To further validate the model, we tested it on the Kaggle steel strip dataset and on our custom dataset, where the average accuracy improved by 2.3% and 5.5%, respectively. These results indicate that the model meets the requirements for real-time, lightweight, and portable deployment.
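To make the role of the WIoU loss more concrete, the following is a minimal sketch of one common form of it. The abstract does not say which WIoU variant is used, so this assumes the v1 form, in which the plain IoU loss is scaled by a distance-based factor computed from the box centers and the smallest enclosing box. The function names `iou` and `wiou_v1_loss` and the `(x1, y1, x2, y2)` box format are illustrative assumptions, not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1_loss(pred, target):
    """Assumed WIoU v1 form: exp(center_dist^2 / enclosing_diag^2) * (1 - IoU).

    In a real training framework the enclosing-box term is usually treated as
    a constant (no gradient flows through it); plain Python has no gradients,
    so that detail is only noted here.
    """
    # Box centers.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    denom = wg ** 2 + hg ** 2
    # Guard against degenerate zero-size boxes.
    dist = ((pcx - tcx) ** 2 + (pcy - tcy) ** 2) / denom if denom > 0 else 0.0
    return math.exp(dist) * (1.0 - iou(pred, target))


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print("IoU:", iou(predicted, ground_truth))
    print("WIoU v1 loss:", wiou_v1_loss(predicted, ground_truth))
```

Because the scaling factor is at least 1 and grows with the center distance, a badly placed prediction is penalized more than under the plain IoU loss, while a perfectly matched box still yields zero loss.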