A Multi-Scale Feature Fusion Network for Chip Surface Defect Detection
Chip surface defect detection plays a crucial role in semiconductor manufacturing and the electronics industry. However, the relatively tiny defect areas and complex defect features of chip surfaces make the detection task low in accuracy. In this paper, we propose a multi-scale feature fusion network based on an encoder-decoder architecture to address this challenge. In particular, we propose a Residual Feature Fusion Attention Module (RFFAM) and an Intersection over Union Loss for Defect with Auxiliary Bounding Box (DA-IoU), and we incorporate a Transformer Prediction Head (TPH) into the network. Additionally, to address the shortage of chip surface defect datasets, this paper presents a chip surface defect dataset containing four defect categories. The experimental results show that the method not only improves the detection accuracy but also maintains a small number of parameters, satisfying the engineering requirements of chip surface defect detection.
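To make the small-defect difficulty concrete, the sketch below computes the plain Intersection over Union (IoU) that IoU-style box losses build on. It is not the DA-IoU loss itself, whose auxiliary-box formulation the abstract does not spell out; the function names `box_iou` and `iou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(pred, target)


# The same 2-pixel shift costs a tiny box far more IoU than a large one.
small = box_iou((0, 0, 4, 4), (2, 0, 6, 4))      # 4x4 box shifted by 2 px
large = box_iou((0, 0, 40, 40), (2, 0, 42, 40))  # 40x40 box shifted by 2 px
print(round(small, 3), round(large, 3))
```

A fixed localization error of a couple of pixels barely changes the IoU of a large object but removes most of the overlap for a tiny defect, which is one reason small-defect detection tends to motivate modified IoU losses.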
- Research Article
20
- 10.3390/rs13020328
- Jan 19, 2021
- Remote Sensing
The classification of high-resolution (HR) synthetic aperture radar (SAR) images is of great importance for SAR scene interpretation and application. However, the presence of intricate spatial structural patterns and the complex statistical nature of the data make SAR image classification a challenging task, especially when labeled SAR data are limited. This paper proposes a novel HR SAR image classification method using a multi-scale deep feature fusion network and covariance pooling manifold network (MFFN-CPMN). MFFN-CPMN combines the advantages of local spatial features and global statistical properties and considers the multi-feature information fusion of SAR images in representation learning. First, we propose a Gabor-filtering-based multi-scale feature fusion network (MFFN) to capture the spatial pattern and obtain discriminative features of SAR images. The MFFN is a deep convolutional neural network (CNN). To make full use of a large amount of unlabeled data, the weights of each layer of MFFN are optimized by an unsupervised denoising dual-sparse encoder. Moreover, the feature fusion strategy in MFFN can effectively exploit the complementary information between different levels and different scales. Second, we utilize a covariance pooling manifold network to further extract the global second-order statistics of SAR images over the fused feature maps. Finally, the obtained covariance descriptor is more distinct for various land covers. Experimental results on four HR SAR images demonstrate the effectiveness of the proposed method, which achieves promising results compared with other related algorithms.
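The second-order statistics mentioned above can be illustrated with a bare covariance pooling step: the feature vectors at all spatial positions are summarized by their channel covariance matrix. This is only a minimal sketch of the pooling idea, not the manifold network of MFFN-CPMN; the function name `covariance_pool` and the list-of-vectors input layout are assumptions.

```python
def covariance_pool(features):
    """Pool N feature vectors (each of length C) into a C x C covariance matrix.

    `features` is a list of N equal-length lists, one per spatial position.
    Uses the unbiased estimate (division by N - 1), so N must be at least 2.
    """
    n, c = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(c)]
    cov = [[0.0] * c for _ in range(c)]
    for f in features:
        d = [f[j] - mean[j] for j in range(c)]  # centered feature vector
        for i in range(c):
            for j in range(c):
                cov[i][j] += d[i] * d[j]
    return [[v / (n - 1) for v in row] for row in cov]


# Three positions, two channels; channel 2 is always twice channel 1.
print(covariance_pool([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]))
```

The resulting matrix is symmetric and captures how channels co-vary across the image, which first-order pooling (mean or max) discards.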
- Research Article
71
- 10.1109/access.2022.3227205
- Jan 1, 2022
- IEEE Access
The detection of defects is indispensable in industrial production. Surface defects have different scales. Both minimal flaws and significant scratches may appear on the same product. The standard method uses a multi-scale feature fusion network, introducing many parameters that may reduce the inference speed. In actual industrial production scenarios, inference speed and accuracy play an equally important role. Therefore, we propose an algorithm that improves detection speed while also improving detection accuracy. The model proposed in this paper is called “YOLO with lightweight feature fusion network” (LFF-YOLO). First, we use ShuffleNetv2 as a feature extraction network to reduce the number of parameters. Then, to improve the efficiency of multi-scale feature fusion, we propose the lightweight feature pyramid network (LFPN). A fixed receptive field adapts poorly to defects of different scales, which may hinder model convergence and seriously affect detection performance. Therefore, we propose the adaptive receptive field feature extraction (ARFFE) module, which weights the multi-receptive field channels to generate multi-receptive field information. In addition, focal loss is used to address the imbalance between positive and negative samples. Finally, we conducted experiments on NEU-DET (79.23% mAP), the Peking University printed circuit board defect dataset (93.31% mAP), and GC10-DET (59.78% mAP). Extensive experiments show that our proposed method achieves optimal detection speed compared with the prevailing methods, and the detection accuracy of our method is also highly competitive.
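The focal loss referred to above can be sketched for a single binary prediction as follows. This follows the commonly cited form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the defaults `gamma=2.0` and `alpha=0.25` are the usual illustrative values and are an assumption here, not settings reported for LFF-YOLO.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    gamma down-weights easy examples; alpha balances positives vs. negatives.
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)       # clamp to keep log() finite
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background sample contributes far less than a hard one.
easy_negative = binary_focal_loss(0.01, 0)
hard_negative = binary_focal_loss(0.90, 0)
print(easy_negative < hard_negative)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, so `gamma` is the knob that controls how strongly the many easy background samples are suppressed.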
- Research Article
- 10.1088/2631-8695/ae344d
- Feb 1, 2026
- Engineering Research Express
Precise defect detection of aluminum-plastic blister packaging before drug boxing has become a critical step in pharmaceutical production. To address the common challenges in current algorithms, such as the difficulty in identifying small and complex defect features and low detection accuracy, while ensuring real-time performance and efficiency, we propose a computer vision-based multi-scale dual-stream feature fusion neural network, named M-DSFBNet. First, we introduce the multi-scale frequency-aware convolutional module, which enhances feature representation capabilities through a split-perception-selection strategy to improve network performance, while enabling lightweight network processing. Additionally, we design a neck fusion network that improves the aggregation and fusion of multi-scale feature information via the MSTB (multi-scale feature transformation block) with dual-path inputs. We collected defect samples and constructed the ABP-DET dataset for aluminum-plastic blister drug packaging surface defects, aimed at supporting the training and evaluation of detection models. Experimental results on the ABP-DET dataset show that M-DSFBNet achieves an average precision (mAP50) of 94.1%, while maintaining low computational complexity and parameter count, demonstrating a good balance between detection accuracy and computational efficiency. In addition, our model achieved an mAP50 of 79.9% on the NEU-DET dataset, demonstrating the strong generalization capability of M-DSFBNet.
- Research Article
1
- 10.1088/2631-8695/adf59b
- Aug 11, 2025
- Engineering Research Express
This study presents a lightweight multi-scale self-calibrating feature fusion network (YSCANet) to address critical challenges in wafer defect detection, including low recognition accuracy for morphologically similar defects, large inter-class scale variations, and severe category imbalance. The proposed framework integrates three key innovations. First, the self-calibrating feature fusion block (SCFFB) enhances multiscale defect feature representation through cross-scale channel recalibration. Second, the Context Anchor Attention (CAA) mechanism optimizes spatial-semantic correlations via anchor point weighting, effectively resolving feature ambiguity in tiny defects. Third, dynamic sample reweighting via Focal Loss mitigates class imbalance effects. Experimental validation on the WM-811K dataset demonstrates that YSCANet achieves a state-of-the-art average classification accuracy of 97.72% (0.76% higher than existing state-of-the-art methods), while reducing defect leakage rates by approximately 50%. The network exhibits exceptional computational efficiency with an inference speed of 151.28 FPS and a compact parameter size of 2.07M. Comparative analysis shows that YSCANet has superior performance among similar defect detection solutions. These advances make YSCANet a powerful solution for real-time, high-precision wafer detection in semiconductor manufacturing, while balancing accuracy, efficiency and hardware deployment constraints.
- Research Article
2
- 10.1088/1361-6501/adb204
- Feb 17, 2025
- Measurement Science and Technology
Permanent magnet motors may suffer from imperceptible localized demagnetization due to surface damage during operation. This paper proposes a lightweight cross-scale decoupling feature fusion network (LCDFFN) for permanent magnet defect detection, which introduces separable convolution in the backbone and incorporates adaptive multi-scale feature fusion with an interactive attention mechanism to enhance small target detection. The adaptive attention module (AAM) and a feature enhancement module based on dilated convolution are introduced in the feature fusion network, improving multi-scale object detection. Finally, the decoupled feature prediction network outputs the defect identification feature map. In addition, we propose a novel fuzzy intersection over union loss function. LCDFFN achieves a mAP@0.5 of 97.8%, with 108.4M parameters and a detection speed of 113.62 FPS. The proposed method significantly improves defect image detection for permanent magnets and is highly practical for industrial production.
- Research Article
- 10.1038/s41598-026-35913-8
- Jan 16, 2026
- Scientific reports
Surface defect detection on steel components is crucial for quality control in polysilicon production. However, this task remains challenging due to tiny defect sizes, irregular geometries, complex backgrounds, and low contrast. To address these issues, we propose MSEOD-DDFusionNet (Multi-Scale and Effective Object-Detection Diffusion Fusion Network), a novel multi-scale diffusion-enhanced attention network. The network integrates four specialized modules: MTECAAttention (Multi-Scale Texture Enhancement Channel-Aware Attention) for lossless multi-scale feature fusion, ODConv (Omni-Dimensional Dynamic Convolution) for dynamic adaptation to irregular geometries, LMDP (Local Multi-Scale Discriminative Perception) for selective noise suppression and micro-defect amplification, and DDFusion (Diffusion-Driven Feature Fusion) for scene-aware noise modeling. Pruning further reduces computational complexity while improving accuracy. Extensive experiments on the specialized DDTE dataset and public benchmarks demonstrate state-of-the-art performance. Our model achieves 82.6% [Formula: see text] and 61.6% [Formula: see text] on DDTE, while maintaining a high inference speed of 193.5 FPS with only 8.46M parameters. It also shows excellent generalization across NEU-DET, GC10-DET, and cross-domain tasks, providing an efficient and accurate solution for industrial defect inspection.
- Research Article
2
- 10.1007/s40747-024-01699-3
- Dec 2, 2024
- Complex & Intelligent Systems
To address challenges such as small target sizes, low contrast, significant intra-class variations, and indistinct inter-class differences in surface defect detection, this paper proposes the Enhanced Context-aware Parallel Fusion Network (EC-PFN). The network employs a Feature Weave Network architecture to enhance contextual awareness and parallel fusion capabilities. It utilizes a Feature Fusion Module (UniFusionLayer) for effective multiscale and multisemantic feature learning, offering new perspectives on feature fusion. Additionally, a Receptive Field Block (RFB) module is introduced to expand the receptive field, enhancing feature extraction in scenarios with low contrast and subtle defects. The Loss Ranking Module (LRM) is incorporated to optimize the target-oriented loss, improving performance by omitting low-confidence bounding boxes. Extensive experiments on a light guide plate defect dataset demonstrate that EC-PFN achieves a detection accuracy (mAP) of 98.9%, a detection speed of 92 FPS, and a computational cost of 14.5 GFLOPs, outperforming mainstream surface defect detection models.
- Research Article
2
- 10.3390/electronics14071422
- Apr 1, 2025
- Electronics
Small defects on the surface of copper strips have a significant impact on key properties such as electrical conductivity and corrosion resistance, and existing inspection techniques struggle to meet the demand in terms of accuracy and generalisability. Although there have been some studies on metal surface defect detection, there is a relative lack of research on highly reflective copper strips. In this paper, a lightweight and efficient copper strip defect detection algorithm, SC-AttentiveNet, is proposed, aiming to solve the problems of the large model size, slow speed, insufficient accuracy and poor generalisability of existing models. The algorithm is based on ConvNeXt V2, and combines the SCDown module and group normalisation to design the SCGNNet feature extraction network, which significantly reduces the computational overhead while maintaining excellent feature extraction capability. In addition, the algorithm introduces the SPPF-PSA module to enhance the multi-scale feature extraction capability, and constructs a new neck feature fusion network via the HD-CF Fusion Block module, which further enhances the feature diversity and fine granularity. The experimental results show that SC-AttentiveNet has a mAP of 90.11% and 64.14% on the KUST-DET and VOC datasets, respectively, with a parameter count of only 6.365 MB and a computational complexity of 14.442 GFLOPs. Tests on the NEU-DET dataset show that the algorithm has an excellent generalisation performance, with a mAP of 76.41% and a detection speed of 78 FPS, demonstrating a wide range of practical application potential.
- Research Article
8
- 10.3934/mbe.2022408
- Jan 1, 2022
- Mathematical Biosciences and Engineering
An automatic surface defect detection system supports real-time surface defect detection by reducing the information and highlighting the critical defect regions for high-level image understanding. However, defects exhibit low contrast, varied textures and geometric structures, and can appear in multiples, making surface defect detection more difficult. In this paper, a pixel-wise detection framework based on a convolutional neural network (CNN) is proposed for strip steel surface defect detection. First, we extract the salient features with a pre-trained backbone network. Second, a contextual weighting module with different convolutional kernels is used to extract multi-scale context features to achieve overall defect perception. Finally, cross integration is employed to make full use of this context information and decode it to achieve complementary feature information. The experimental results of this study demonstrate that the proposed method outperforms previous state-of-the-art methods on a strip steel surface defect dataset (MAE: 0.0396; Fβ: 0.8485).
- Research Article
- 10.3390/ma18153646
- Aug 3, 2025
- Materials
Accurate weld defect detection is critical for ensuring structural safety and evaluating welding quality in industrial applications. Manual inspection methods have inherent limitations, including inefficiency and inadequate sensitivity to subtle defects. Existing detection models, primarily designed for natural images, struggle to adapt to the characteristic challenges of weld X-ray images, such as high noise, low contrast, and inter-defect similarity, which lead in particular to missed detections and false positives for small defects. To address these challenges, a multi-dimensional feature fusion model (MADet), a multi-branch deep fusion network for weld defect detection, is proposed. The framework incorporates two key innovations: (1) a multi-scale feature fusion network integrated with lightweight attention residual modules to enhance the perception of fine-grained defect features by leveraging low-level texture information; (2) an anchor-based feature-selective detection head to improve the discrimination and localization accuracy for five typical defect categories. Extensive experiments on both public and proprietary weld defect datasets demonstrate that MADet achieves significant improvements over state-of-the-art YOLO variants. Specifically, it surpasses the second-best model by 7.41% in mAP@0.5, indicating strong industrial applicability.
- Research Article
25
- 10.3390/drones8050186
- May 8, 2024
- Drones
Unmanned aerial vehicles (UAVs) are now widely used in many fields. Due to the randomness of UAV flight height and shooting angle, UAV images usually have the following characteristics: many small objects, large changes in object scale, and complex background. Therefore, object detection in UAV aerial images is a very challenging task. To address the challenges posed by these characteristics, this paper proposes a novel UAV image object detection method based on global feature aggregation and context feature extraction named the multi-scale feature information extraction and fusion network (MFEFNet). Specifically, first of all, to extract the feature information of objects more effectively from complex backgrounds, we propose an efficient spatial information extraction (SIEM) module, which combines residual connection to build long-distance feature dependencies and effectively extracts the most useful feature information by building contextual feature relations around objects. Secondly, to improve the feature fusion efficiency and reduce the burden brought by redundant feature fusion networks, we propose a global aggregation progressive feature fusion network (GAFN). This network adopts a three-level adaptive feature fusion method, which can adaptively fuse multi-scale features according to the importance of different feature layers and reduce unnecessary intermediate redundant features by utilizing the adaptive feature fusion module (AFFM). Furthermore, we use the MPDIoU loss function as the bounding-box regression loss function, which not only enhances model robustness to noise but also simplifies the calculation process and improves the final detection efficiency. Finally, the proposed MFEFNet was tested on VisDrone and UAVDT datasets, and the mAP0.5 value increased by 2.7% and 2.2%, respectively.
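The bounding-box regression loss named above can be sketched as follows. This follows the MPDIoU definition as it is commonly stated: IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared diagonal of the input image. The function name, the `(x1, y1, x2, y2)` box layout, and the example image size are assumptions, and the sketch is not taken from the MFEFNet implementation.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """MPDIoU-style loss for two boxes given as (x1, y1, x2, y2)."""
    # Standard IoU term.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d_top_left = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d_bottom_right = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2          # squared image diagonal

    mpdiou = iou - d_top_left / norm - d_bottom_right / norm
    return 1.0 - mpdiou


# Non-overlapping boxes still get a distance-dependent loss, unlike plain IoU.
near = mpdiou_loss((0, 0, 4, 4), (6, 0, 10, 4), 100, 100)
far = mpdiou_loss((0, 0, 4, 4), (50, 0, 54, 4), 100, 100)
print(near < far)
```

Because the corner-distance terms stay informative when the boxes do not overlap, the loss keeps distinguishing near misses from far misses, which matters for the many small objects in aerial images.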
- Research Article
8
- 10.3390/app112210508
- Nov 9, 2021
- Applied Sciences
Surface defect detection of an automobile wheel hub is important to the automobile industry because these defects directly affect the safety and appearance of automobiles. At present, surface defect detection networks based on convolutional neural network use many pooling layers when extracting features, reducing the spatial resolution of features and preventing the accurate detection of the boundary of defects. On the basis of DeepLab v3+, we propose a semantic segmentation network for the surface defect detection of an automobile wheel hub. To solve the gridding effect of atrous convolution, the high-resolution network (HRNet) is used as the backbone network to extract high-resolution features, and the multi-scale features extracted by the Atrous Spatial Pyramid Pooling (ASPP) of DeepLab v3+ are superimposed. On the basis of the optical flow, we decouple the body and edge features of the defects to accurately detect the boundary of defects. Furthermore, in the upsampling process, a decoder can accurately obtain detection results by fusing the body, edge, and multi-scale features. We use supervised training to optimize these features. Experimental results on four defect datasets (i.e., wheels, magnetic tiles, fabrics, and welds) show that the proposed network has better F1 score, average precision, and intersection over union than SegNet, Unet, and DeepLab v3+, proving that the proposed network is effective for different defect detection scenarios.
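The gridding effect of atrous (dilated) convolution mentioned above can be seen in a one-dimensional toy example. The helper below is a hypothetical illustration, not part of DeepLab v3+ or HRNet: it applies a kernel whose taps are spaced `dilation` samples apart, with no padding.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D convolution with taps spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation + 1   # receptive field of one output
    return [
        sum(kernel[k] * signal[start + k * dilation] for k in range(len(kernel)))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
print(dilated_conv1d(signal, [1, 1, 1], dilation=1))  # each output sees 3 adjacent samples
print(dilated_conv1d(signal, [1, 1, 1], dilation=2))  # each output spans 5 samples but reads only 3
```

With `dilation=2` each output covers a span of five samples yet skips every other one, so neighboring inputs never contribute to the same output. Stacking such layers with the same rate leaves a checkerboard of unused positions, which is the gridding effect that motivates combining atrous features with dense high-resolution ones.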
- Research Article
2
- 10.1088/1361-6501/adb32a
- Feb 17, 2025
- Measurement Science and Technology
Surface defect detection in industrial manufacturing ensures product quality and prevents malfunctions. To address issues such as multi-scale damage, low contrast, and small defects on the surfaces of industrial components, we propose an efficient multi-scale feature enhancement network for improving the detection performance of industrial surface defects. First, a multi-scale extraction module is proposed to extract defect features at multiple levels to ensure sufficient semantic information for multi-scale damage and enhance the feature extraction ability of defects with different scales. Dual-orientation attention is then introduced into the detection network to establish a connection between spatial and channel dimensional information, which enables the network to focus on defect regions and filter out background noise. This alleviates the problems of low contrast and small defects. The experimental results confirm that the proposed network demonstrates superior detection performance compared to other detection algorithms across five surface defect datasets. Additionally, the parameters are reduced by 7.9%, the floating-point operations decrease by 6.7%, and the model size is reduced by 5.2%. These improvements collectively provide an efficient solution for industrial surface defect detection.
- Research Article
32
- 10.1109/tim.2021.3096284
- Jan 1, 2021
- IEEE Transactions on Instrumentation and Measurement
Surface defect detection is a challenging task in industrial manufacturing. Recent methods using supervised learning need a large-scale dataset to achieve precise detection. However, the time and difficulty involved in data acquisition make it hard to build a large-scale dataset. This article proposes a domain adaptive network, called multiscale adversarial and weighted gradient domain adaptive network (MWDAN), for surface defect detection under data scarcity. With MWDAN, a detection model trained on a small-scale dataset can gain knowledge transferred from another large-scale dataset; that is, even when large amounts of training data are difficult to collect, a good defect detection model can still be constructed with the aid of another dataset that is relatively easy to acquire. The MWDAN is constructed at two levels. At the image level, a multiscale domain feature adaptation approach is proposed to address the domain shift between the source domain and the target domain. At the instance level, a piecewise weighted gradient reversal layer (PWGRL) is designed to balance the weight of the backpropagation gradient for the hard- and easy-confused samples in domain classification and force confusion. The PWGRL can then reduce the local instance difference to further promote domain consistency. Experiments on metal surface defect detection show encouraging results for the proposed MWDAN method.
- Research Article
97
- 10.1109/tii.2021.3085848
- Mar 1, 2022
- IEEE Transactions on Industrial Informatics
Rail surface defect inspection based on machine vision faces challenges against the complex background with interference and severe data imbalance. To meet these challenges, in this article, we regard defect detection as a key-point estimation problem and present the proposed attention neural network for rail surface defect detection via consistency of Intersection-over-Union(IoU)-guided center-point estimation (CCEANN). The CCEANN contains two crucial components. The two components are the stacked attention Hourglass backbone via cross-stage fusion of multiscale features (CSFA-Hourglass) and the CASIoU-guided center-point estimation head module (CASIoU-CEHM). Furthermore, the CASIoU-guided center-point estimation head module integrating the delicate coordinate compensation mechanism regresses detection boxes flexibly to adapt to defects' large-scale variation, in which the proposed CASIoU loss, a loss regressing the consistency of intersection-over-union (IoU), central-point distance, area ratio, and scale ratio between the targeted defect and the predicted defect, achieves higher regression accuracy than state-of-the-art IoU-based losses. The experiments demonstrate that the CCEANN outperforms competitive deep learning-based methods in four surface defect datasets.