Abstract

Machine learning has been applied successfully to faulty wafer detection tasks in semiconductor manufacturing. For these tasks, prediction models are built from prior data to predict the quality of future wafers as a function of their preceding process parameters and measurements. In real-world problems, it is common for the data to contain a portion of input variables that are irrelevant to the prediction of the output variable. The inclusion of many irrelevant variables negatively affects the performance of prediction models. Typically, prediction models learned by different learning algorithms exhibit different sensitivities to irrelevant variables. Algorithms with low sensitivities are preferred as a first trial for building prediction models, whereas a variable selection procedure should be considered for highly sensitive algorithms. In this study, we investigate the effect of irrelevant variables on three well-known representative learning algorithms that can be applied to both classification and regression tasks: artificial neural network, decision tree (DT), and k-nearest neighbors (k-NN). We analyze the characteristics of these learning algorithms in the presence of irrelevant variables under different model complexity settings. An empirical analysis is performed using real-world datasets collected from a semiconductor manufacturer to examine how the number of irrelevant variables affects the behavior of prediction models trained with different learning algorithms and model complexity settings. The results indicate that the prediction accuracy of k-NN is highly degraded, whereas DT demonstrates the highest robustness in the presence of many irrelevant variables. In addition, a higher model complexity of a learning algorithm leads to a higher sensitivity to irrelevant variables.

Highlights

  • In the semiconductor manufacturing process, the quality of wafers is affected by various internal and external factors [1,2]

  • We evaluate the effect of the number of irrelevant variables on different learning algorithms, each of which builds a prediction model in a different manner based on its own competence

  • On the other hand, applying variable selection is essential for prediction models that are sensitive to irrelevant variables, such as the k-nearest neighbors (k-NN) with a high model complexity


Introduction

In the semiconductor manufacturing process, the quality of wafers is affected by various internal and external factors [1,2]. A learning algorithm A is employed to build a function that best approximates the true function f, called a prediction model f_A, from a set of given instances called a training dataset D = {(x_i, y_i)}_{i=1}^{n}, where x_i is a vector of input variables, y_i is the corresponding value of an output variable, and n is the number of training instances. By applying a variable selection procedure, we can obtain more accurate and concise prediction models that include a reduced number of input variables. When computational time and resources are limited, a single learning algorithm that is less sensitive to irrelevant variables is preferable [9]. Such an algorithm yields good prediction models that can be used as is, without requiring variable selection.
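The sensitivity of k-NN to irrelevant variables discussed above can be illustrated with a small simulation. The sketch below is not the paper's experiment; it uses synthetic data with one relevant input variable and a varying number of pure-noise variables, and measures how 1-NN accuracy degrades as the noise dimensions dilute the distance computation. All names (knn_accuracy, the class-separation and noise settings) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_accuracy(X_train, y_train, X_test, y_test, k=1):
    # Brute-force k-NN classification by Euclidean distance.
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        nn_labels = y_train[np.argsort(d)[:k]]
        preds.append(np.bincount(nn_labels).argmax())
    return float(np.mean(np.array(preds) == y_test))

# Two well-separated classes along a single relevant dimension.
n = 200
y = rng.integers(0, 2, n)
relevant = y[:, None] * 2.0 + rng.normal(0.0, 0.5, (n, 1))

results = {}
for n_irrelevant in (0, 5, 50):
    # Append input variables that carry no information about y.
    noise = rng.normal(0.0, 1.0, (n, n_irrelevant))
    X = np.hstack([relevant, noise])
    results[n_irrelevant] = knn_accuracy(X[:100], y[:100], X[100:], y[100:], k=1)

for n_irrelevant, acc in results.items():
    print(f"{n_irrelevant:2d} irrelevant variables -> accuracy {acc:.2f}")
```

Because every input dimension contributes equally to the Euclidean distance, the irrelevant variables increasingly dominate the neighbor ranking, so accuracy falls as their number grows; an axis-splitting learner such as a decision tree can simply ignore uninformative dimensions, which is consistent with the robustness reported in the abstract.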

Related Work
Learning with Irrelevant Variables
Empirical Analysis
Problem Description
Experimental Design
Results and Discussion
Concluding Remarks