Abstract
Surface defect detection on wafers is crucial for quality control in semiconductor manufacturing. However, the complexity of defect spatial features, including mixed defect types, large scale differences, and overlap, results in low detection accuracy. In this paper, we propose CC-De-YOLO, a model based on the YOLOv7 backbone network. First, coordinate attention is inserted into the main feature extraction network. Coordinate attention decomposes channel attention into two one-dimensional feature encoding processes that aggregate features along the horizontal and vertical spatial directions, which enhances the network’s sensitivity to orientation and position. Second, the nearest-neighbor interpolation in the upsampling part is replaced by the CAR-EVC module, which predicts the upsampling kernel from the previous feature map and integrates semantic information into the feature map. Two residual structures are used to capture long-range semantic dependencies and improve feature representation capability. Finally, an efficient decoupled detection head separates the classification and regression tasks for better defect classification. To evaluate the model’s performance, we established a wafer surface defect dataset containing six typical defect categories. The experimental results show that CC-De-YOLO achieves 91.0% mAP@0.5 and 46.2% mAP@0.5:0.95, with a precision of 89.5% and a recall of 83.2%. CC-De-YOLO outperforms both the original YOLOv7 model and the other object detection models it was compared against. Therefore, our proposed method meets the accuracy requirements for wafer surface defect detection and has broad application prospects. The wafer surface defect dataset is publicly available on GitHub (https://github.com/ztao3243/Wafer-Datas.git).
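The directional aggregation behind coordinate attention can be illustrated with a minimal sketch. This is a single-channel illustration in plain Python under stated assumptions, not the authors' implementation: the function names `directional_pool` and `coordinate_reweight` are hypothetical, and the learned 1x1 convolutions that a full coordinate attention block applies between the pooling step and the sigmoid are omitted.

```python
import math


def directional_pool(fmap):
    """Average-pool a single-channel H x W map along each spatial direction.

    Returns (row_means, col_means): one value per row (aggregated along the
    horizontal direction) and one value per column (aggregated along the
    vertical direction). These are the two one-dimensional encodings.
    """
    row_means = [sum(row) / len(row) for row in fmap]
    col_means = [sum(col) / len(col) for col in zip(*fmap)]
    return row_means, col_means


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_reweight(fmap):
    """Reweight each position by a row weight and a column weight.

    Assumption: the learned transforms of a real block are skipped, so the
    attention weights are simply the sigmoid of the pooled values.
    """
    row_means, col_means = directional_pool(fmap)
    a_h = [sigmoid(v) for v in row_means]  # one weight per row
    a_w = [sigmoid(v) for v in col_means]  # one weight per column
    return [
        [fmap[i][j] * a_h[i] * a_w[j] for j in range(len(a_w))]
        for i in range(len(a_h))
    ]


if __name__ == "__main__":
    toy_map = [[1.0, 3.0], [5.0, 7.0]]  # hypothetical 2 x 2 feature map
    print(directional_pool(toy_map))
    print(coordinate_reweight(toy_map))
```

Because each output value is scaled by both a row-specific and a column-specific weight, the reweighting retains positional information along both axes, which global average pooling over the whole map would discard.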