Integrating Local and Global Features for Wafer Defect Pattern Classification via Sequential Hybrid Architecture

Abstract

Wafer map defect pattern classification supports quality monitoring in semiconductor manufacturing, but public benchmark datasets such as WM-811K exhibit extreme class imbalance, where majority classes can dominate standard metrics. This study aims to improve minority class performance while maintaining inference efficiency. Building on an iFormer-based hybrid backbone, we propose the Pattern-Selective Sequential Hybrid Network (PSS-HNet), which redesigns attention blocks to sequentially integrate local interaction (Modulated Convolution) and global interaction (Modulated Axial Attention) and applies sigmoid-based gating to control contextual information injection. Experiments on WM-811K (9 classes) compare iFormer (baseline), Axial-only, Axial+Modulation, and PSS-HNet using macro-averaged metrics as primary indicators, along with class-wise analysis and efficiency evaluation. PSS-HNet improves Macro-Recall by 1.02 percentage points (from 0.8852 to 0.8954) and Macro-F1 by 0.54 percentage points (from 0.9044 to 0.9098) over the baseline while maintaining similar accuracy. It also reduces computational cost and inference latency to 0.754 G FLOPs, 4.381 M parameters, and 7.682 ms, compared with 1.103 G FLOPs, 6.245 M parameters, and 8.666 ms for the baseline. Overall, selective sequential local–global integration provides a favorable balance between minority class performance and efficiency.
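The reason macro-averaged metrics are used as primary indicators under extreme class imbalance can be illustrated with a small sketch. The code below is not the authors' evaluation code; the two class labels and the 95/5 split are made-up toy values, and the helper names (`per_class_recall_f1`, `macro_scores`) are hypothetical. It shows that a classifier missing most minority-class samples can still report high accuracy, while macro-recall and macro-F1 drop sharply because every class contributes equally to the average.

```python
from collections import Counter


def per_class_recall_f1(y_true, y_pred):
    """Return {label: (recall, f1)} for every label that occurs in y_true."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class gained a false positive
    scores = {}
    for c in sorted(set(y_true)):
        support = tp[c] + fn[c]
        predicted = tp[c] + fp[c]
        recall = tp[c] / support if support else 0.0
        precision = tp[c] / predicted if predicted else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (recall, f1)
    return scores


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro_recall, macro_f1); each class weighs equally."""
    scores = per_class_recall_f1(y_true, y_pred)
    n_classes = len(scores)
    macro_recall = sum(r for r, _ in scores.values()) / n_classes
    macro_f1 = sum(f for _, f in scores.values()) / n_classes
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, macro_recall, macro_f1


# Toy imbalanced data (illustrative only): 95 majority, 5 minority samples.
y_true = ["none"] * 95 + ["scratch"] * 5
# The hypothetical classifier finds only 1 of the 5 minority samples.
y_pred = ["none"] * 95 + ["none"] * 4 + ["scratch"] * 1

acc, macro_rec, macro_f1 = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_recall={macro_rec:.4f} macro_f1={macro_f1:.4f}")
```

In this toy case accuracy is 0.96 while macro-recall is 0.60, which is the kind of gap that makes accuracy alone a weak indicator of minority class performance.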

Similar Papers
  • Research Article
  • Cited by 16
  • 10.1016/j.eswa.2023.120765
A novel approach for wafer defect pattern classification based on topological data analysis
  • Nov 1, 2023
  • Expert Systems with Applications
  • Seungchan Ko + 1 more

  • Research Article
  • Cited by 4
  • 10.1080/08982112.2023.2286502
Mixed-type defect pattern recognition in noisy labeled wafer bin maps
  • Dec 6, 2023
  • Quality Engineering
  • Sumin Kim + 1 more

In semiconductor manufacturing, classification of defect patterns in wafer bin maps (WBMs) helps engineers detect process failures and identify their causes. In recent studies on WBMs, convolutional neural networks (CNNs) have demonstrated effective classification performance based on their high expressive power. However, previous studies have implicitly assumed that the labels of WBMs used for training CNNs are correct, even though labels are often incorrect. When trained on mislabeled data, CNNs with standard cross-entropy loss can easily overfit mislabeled samples, leading to poor generalization for testing data. To overcome this issue, we propose a novel training algorithm called sample bootstrapping. Sample bootstrapping identifies which samples have clean or noisy labels by using a two-component beta mixture model, and measures the uncertainty of each identified label. Then, samples with low uncertainty of their estimated labels are selected to build mini-batches via weighted random sampling. Finally, CNNs are trained on the selected mini-batches with dynamic bootstrapping loss. In this manner, we can correct only the samples that are highly likely to have noisy labels and prevent the risk of false correction of correctly labeled samples. Experiments on simulated and real datasets demonstrate the effectiveness of the proposed method.

  • Book Chapter
  • Cited by 19
  • 10.1007/978-3-642-39479-9_45
Detection and Classification of Defect Patterns in Optical Inspection Using Support Vector Machines
  • Jan 1, 2013
  • Liangjun Xie + 2 more

Optical inspection techniques have been widely used in industry as they are non-destructive, efficient to perform, easy to process, and can provide rich information on product quality. Defect patterns such as rings, semi-circles, scratches, and clusters are the most common defects in the semiconductor industry. Most methods cannot recognize that two defect patterns differing only in scale, shift, or rotation in fact share the same failure cause. To address these problems, this paper proposes a new approach to detect these defect patterns in noisy images obtained from printed circuit boards, wafers, etc. A median filter, background removal, morphological operations, segmentation, and labeling are employed in the detection stage of our method. A support vector machine (SVM) is used to identify the resized defect patterns. Classification results on both simulated data and real noisy raw data show the effectiveness of our method.

  • Research Article
  • Cited by 1
  • 10.1086/680581
Comment
  • Jan 1, 2015
  • NBER Macroeconomics Annual
  • Samuel Kortum + 1 more

  • Research Article
  • Cited by 7
  • 10.1109/tsm.2021.3131597
Semiconductor Defect Pattern Classification by Self-Proliferation-and-Attention Neural Network
  • Feb 1, 2022
  • IEEE Transactions on Semiconductor Manufacturing
  • Yuanfu Yang + 1 more

Semiconductor manufacturing is on the cusp of a revolution: the Internet of Things (IoT). With IoT, all equipment can be connected and information fed back to the factory so that quality issues can be detected. As a result, more and more edge devices are used in wafer inspection equipment, and these edge devices must be able to detect defects quickly. The primary task is therefore to develop a high-efficiency architecture for automatic defect classification that is suitable for edge devices. In this paper, we present a novel architecture that performs defect classification more efficiently. Its first function is self-proliferation, which uses a series of linear transformations to generate more feature maps at a lower cost. Its second function is self-attention, which captures the long-range dependencies of feature maps through channel-wise and spatial-wise attention mechanisms. We name this method the self-proliferation-and-attention neural network (SP&A-Net). The method has been successfully applied to various defect pattern classification tasks. Compared with other recent methods, SP&A-Net achieves higher accuracy and lower computation cost in many defect inspection tasks.

  • Conference Article
  • Cited by 63
  • 10.1109/asmc.2019.8791815
A Deep Learning Model for Identification of Defect Patterns in Semiconductor Wafer Map
  • May 1, 2019
  • Yang Yuan-Fu

Semiconductors are used as precision components in many electronic products. In wafer fabrication, each layer must be inspected for defects after the mask pattern is drawn and baked. Unfortunately, defects arise from various sources of variation during semiconductor manufacturing and cause massive yield losses for companies. If the defects can be identified and classified correctly, the root of the fabrication problem can be recognized and eventually resolved. Automatic optical inspection (AOI) is used to visualize defect patterns and identify root causes of die failures. AOI can replace a large number of human inspections with high-speed, accurate inspection technology, achieving consistent detection, shortening inspection time, and improving product quality and competitiveness. In AOI, a defect is judged from its features, but the final goal is to determine whether it is a true or a pseudo defect of the wafer and then to determine its type. Current AOI, however, still needs a subsequent final verification by a human to judge the defect type. Machine learning (ML) techniques have been widely accepted and are well suited to such classification and identification problems. In this paper, we employ convolutional neural networks (CNN) and extreme gradient boosting (XGBoost) for wafer map retrieval and defect pattern classification. CNN is the most famous deep learning architecture. The recent surge of interest in CNN is due to the immense popularity and effectiveness of convnets. XGBoost is the most popular machine learning framework among data science practitioners, especially on Kaggle, a platform for data prediction competitions where researchers post their data and statisticians and data miners compete to produce the best models.
CNN and XGBoost are compared with random decision forests (RF), support vector machines (SVM), and adaptive boosting (AdaBoost), and the final results indicate superior classification performance for the proposed method. The overall classification accuracy on the test dataset is 99.2% for CNN and 98.1% for XGBoost. Our experimental results demonstrate the success of CNN and extreme gradient boosting for the identification of defect patterns in semiconductor wafers. We believe this is the first time that computational classification for this task has been reported with accuracy above 99%.

  • Research Article
  • Cited by 7
  • 10.1109/access.2020.2991838
Local Heterogeneous Features for Person Re-Identification in Harsh Environments
  • Jan 1, 2020
  • IEEE Access
  • Haijia Zhang + 5 more

Local features can capture semantic information in pedestrian images, and they are very important for person re-identification (Re-ID) in harsh environments. However, most approaches optimize only one kind of local feature, which results in incomplete local features. In this paper, we propose Local Heterogeneous Features (LHF) to extract discriminative local features from three aspects. To this end, we use three kinds of losses to learn three kinds of local features, i.e., local discriminative features, local relative features, and local compact features. For local discriminative features, we split the attention maps into three horizontal sub-regions and perform the classification operation. Then, we divide the attention maps into two horizontal sub-regions and simultaneously apply the triplet loss and center loss to learn local relative features and local compact features. Finally, we use local discriminative features to represent pedestrians. We evaluate LHF on public person Re-ID datasets and show that LHF is meaningful for local feature learning.

  • Video Transcripts
  • 10.48448/es7a-qt84
GO FIGURE: A Meta Evaluation of Factuality in Summarization
  • Aug 1, 2021
  • Underline Science Inc.
  • Jianfeng Gao + 4 more

While neural language models can generate text with remarkable fluency and coherence, controlling for factual correctness in generation remains an open research question. This major discrepancy between the surface-level fluency and the content-level correctness of neural generation has motivated a new line of research that seeks automatic metrics for evaluating the factuality of machine text. In this paper, we introduce GO FIGURE, a meta-evaluation framework for evaluating factuality evaluation metrics. We propose five necessary conditions to evaluate factuality metrics on diagnostic factuality data across three different summarization tasks. Our benchmark analysis on ten factuality metrics reveals that our meta-evaluation framework provides a robust and efficient evaluation that is extensible to multiple types of factual consistency and standard generation metrics, including QA metrics. It also reveals that while QA metrics generally improve over standard metrics that measure factuality across domains, performance is highly dependent on the way in which questions are generated.

  • Research Article
  • Cited by 71
  • 10.1109/tsm.2020.3038165
Self-Supervised Representation Learning for Wafer Bin Map Defect Pattern Classification
  • Nov 16, 2020
  • IEEE Transactions on Semiconductor Manufacturing
  • Hyungu Kahng + 1 more

Automatic identification of defect patterns in wafer bin maps (WBMs) stands as a challenging problem for the semiconductor manufacturing industry. Deep convolutional neural networks have recently shown decent progress in learning spatial patterns in WBMs, but only at the expense of explicit manual supervision. Unfortunately, a clean set of labeled WBM samples is often limited in both size and quality, especially during rapid process development or early production stages. In this study, we propose a self-supervised learning framework that makes the most out of unlabeled data to learn beforehand rich visual representations for data-efficient WBM defect pattern classification. After self-supervised pre-training based on noise-contrastive estimation, the network is fine-tuned on the available labeled data to classify WBM defect patterns. We argue that self-supervised pre-training with a vast amount of unlabeled data substantially improves classification performance when labels are scarce. We demonstrate the effectiveness of our work on a real-world public WBM dataset, WM-811K. The code is available at https://github.com/hgkahng/WaPIRL.

  • Conference Article
  • Cited by 6
  • 10.1109/itc50671.2022.00006
Wafer Map Defect Classification Based on the Fusion of Pattern and Pixel Information
  • Sep 1, 2022
  • Yiwen Liao + 4 more

With the dramatically increasing requirements on semiconductor products, improving yield is one of the major tasks for semiconductor manufacturers. To minimize losses, automatic and efficient wafer testing tools are required to quickly notify engineers of potential problems. One such technique is wafer map defect pattern classification, which has inspired and motivated extensive research over the last decades. Many popular studies design novel wafer map defect identification algorithms based on manual feature extraction, statistical learning, and deep neural networks, and have achieved significant advancement and success. However, these methods often face the challenge of training large-scale networks, and few of them make full use of the information within each wafer map. Based on these concerns, this paper proposes a multi-task learning framework based on neural networks that fuses the information of the entire wafer map with the state of each individual die to enhance defect pattern classification capability. Extensive experiments on a public real-world dataset have been conducted to justify the effectiveness of our method. Specifically, our method achieved a classification accuracy of 96.3%, which was better than or comparable to other state-of-the-art approaches that required notably larger network sizes and heavy data augmentation.

  • Conference Article
  • Cited by 2
  • 10.1145/3302425.3302427
Facial Expression Recognition using Local Directional Pattern variants and Deep Learning
  • Dec 21, 2018
  • Kennedy Chengeta + 1 more

Automated facial expression recognition has been used with success in medical, industrial security, gaming, and aviation security applications as well as in marketing systems. The study compares and analyses the synergy of a Local Binary Pattern variant and Convolutional Neural Networks (CNNs / ConvNets) in facial expression recognition. Major emotional behavioural states include fear, anger, neutrality, happiness and sadness. Local Directional Patterns (LDP) are used for facial edge detection on local features in grey scale. The study applies LDP feature extraction and uses deep learning CNN algorithms to recognise facial expressions in targeted facial databases. The CNNs are applied to a dataset already processed by the LDP feature extractor. The Local Directional Pattern algorithm is based on the Kirsch edge detection algorithm. The CK+ and Googleset facial expression databases are used in this study. The CNNs used the extracted feature histograms for training. Accuracy is used as the performance measure of the study. A hybrid of Local Directional Patterns, local binary pattern variants and an ensemble voting classifier gave an accuracy within one percentage point below that of convolutional neural networks alone, with sub-minute processing times. The accuracy of a hybrid of LDP feature extraction and a deep learning CNN (LDGPNet) was less than 1 percentage point better than that of convolutional neural networks alone, albeit with quicker processing time. For modest and higher budgets, the study recommends LDGPNet, which uses the Local Directional Pattern feature extractor, Gabor filters and Convolutional Neural Networks. The implementation resulted in reduced processing time, improved edge detection and slightly higher accuracy than Convolutional Neural Networks alone.
For smaller budgets, the study recommends the hybrid of local directional pattern, local binary pattern and ensemble voting classifier, which offers the fastest processing time and an accuracy within 1 to 2 percentage points below that of convolutional neural networks and LDGPNet.

  • Research Article
  • Cited by 3
  • 10.1088/1361-6560/ad4d53
Automatic breast ultrasound (ABUS) tumor segmentation based on global and local feature fusion
  • May 30, 2024
  • Physics in Medicine & Biology
  • Yanfeng Li + 5 more

Accurate segmentation of tumor regions in automated breast ultrasound (ABUS) images is of paramount importance in computer-aided diagnosis systems. However, the inherent diversity of tumors and imaging interference pose great challenges to ABUS tumor segmentation. In this paper, we propose a global and local feature interaction model combined with graph fusion (GLGM) for 3D ABUS tumor segmentation. In GLGM, we construct a dual-branch encoder-decoder in which both local and global features can be extracted. In addition, a global and local feature fusion module is designed, which employs the deepest semantic interaction to facilitate information exchange between local and global features. To improve segmentation performance for small tumors, a graph convolution-based shallow feature fusion module is also designed. It exploits shallow features to enhance the feature expression of small tumors in both local and global domains. The proposed method is evaluated on a private ABUS dataset and a public ABUS dataset. In the private ABUS dataset, small tumors (volume smaller than 1 cm³) account for over 50% of the entire dataset. Experimental results show that the proposed GLGM model outperforms several state-of-the-art segmentation models in 3D ABUS tumor segmentation, particularly in segmenting small tumors.

  • Research Article
  • Cited by 27
  • 10.1016/j.neunet.2021.07.018
A CNN model embedded with local feature knowledge and its application to time-varying signal classification
  • Jul 22, 2021
  • Neural Networks
  • Ruiping Yang + 3 more

  • Research Article
  • Cited by 162
  • 10.1016/s2214-109x(15)00087-x
Length of secondary schooling and risk of HIV infection in Botswana: evidence from a natural experiment.
  • Jun 28, 2015
  • The Lancet Global Health
  • Jan-Walter De Neve + 4 more

Background: An estimated 2·3 million individuals are newly infected with HIV each year. Existing cross-sectional and longitudinal studies have found conflicting evidence on the association between education and HIV risk, and no randomized experiment to date has identified a causal effect of education on HIV incidence. Methods: A 1996 policy reform changed the grade structure of secondary school in Botswana and increased educational attainment. We use this reform as a ‘natural experiment’ to identify the causal effect of schooling on HIV infection. Data on HIV biomarkers and demographics were obtained from the 2004 and 2008 Botswana AIDS Impact Surveys, nationally-representative household surveys (N = 7018). The association between years of schooling and HIV status was described using multivariate OLS regression models. Using exposure to the policy reform as an instrumental variable, we estimated the causal effect of years of schooling on the cumulative probability that an individual contracted HIV up to his or her age at the time of the survey. The cost-effectiveness of secondary schooling as an HIV prevention intervention was assessed in comparison to other established interventions. Findings: Each additional year of secondary schooling induced by the policy change led to an absolute reduction in the cumulative risk of HIV infection of 8·1% points (p = 0·008), relative to a baseline prevalence of 25·6%. Effects were particularly large among women (11·6% points, p = 0·046). Results were robust to a wide array of sensitivity analyses. Secondary school was cost-effective as an HIV prevention intervention by standard metrics. Interpretation: Additional years of secondary schooling had a large protective effect against HIV risk, particularly for women, in Botswana.
Increasing progression through secondary school may be a cost-effective HIV prevention measure in HIV-endemic settings, in addition to yielding other societal benefits. Funding: Takemi Program in International Health at the Harvard School of Public Health, Belgian American Educational Foundation, and Fernand Lazard Foundation.

  • Research Article
  • Cited by 21
  • 10.1016/j.eswa.2023.122301
Semi-supervised imbalanced classification of wafer bin map defects using a Dual-Head CNN
  • Oct 31, 2023
  • Expert Systems With Applications
  • Siyamalan Manivannan
