Automated quality control ensures that customers receive defect-free products that meet their needs. However, the performance of real-world surface defect detection is often severely hindered by data scarcity. Recently, few-shot learning has been widely proposed as a solution to the data insufficiency problem by leveraging a limited number of base class samples. Nevertheless, achieving both discriminative and generalization capabilities with few samples remains challenging across surface defect detection scenarios. In this paper, we propose a sparse cross-transformer network (SCTN) for surface defect detection. First, we introduce a residual layer module to enhance the network's ability to retain crucial information. Next, we propose a sparse layer module within the cross-transformer to increase computational efficiency. Finally, we incorporate a squeeze-and-excitation network into the cross-transformer to enhance the attention mechanism between the local patches output by the transformer encoder. To verify the effectiveness of the proposed method, we conducted extensive experiments on the cylinder liner defect dataset, the NEU steel surface defect dataset, and the PKU-Market-PCB dataset, achieving best mean average precision values of 62.73%, 85.29%, and 88.7%, respectively. The experimental results demonstrate that the proposed method achieves significant improvements over state-of-the-art algorithms. Additionally, the results indicate that SCTN enhances the network's discriminative ability and improves generalization across various surface defect detection tasks.