A quality detection method of the unbalanced data based on the non‐parameter Log–Log prediction model with the feature extraction

Shuying Wang, Jia Chen, Chunjie Wang, Bo Zhao

doi:10.1002/mma.9835

Abstract

In quality detection, it is important to classify and predict unbalanced data sets with a highly skewed ratio of qualified to unqualified products. Several machine learning methods are already available, but they assume that the samples are evenly distributed among the classes and ignore the unbalanced nature of the data. In addition, existing methods cannot be directly applied to high‐dimensional data and cannot accurately express the relationship between data features and the quality of industrial engineering products. In this paper, we propose a new quality detection method for unbalanced data by establishing a non‐parameter Log–Log classification model. Principal component analysis (PCA) is used to extract features and reduce the dimension of the original data sets. We develop a sieve maximum likelihood algorithm to obtain the non‐parameter function classifier. The proposed method is applied to product quality detection in industrial semiconductor manufacturing. The results show that the proposed method has high detection performance and classification ability. Compared with traditional machine learning methods, the proposed method achieves higher classification accuracy, better describes the relationship between product characteristics and product quality, and generalizes well across different data sets.
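The pipeline summarized in the abstract (PCA feature extraction followed by a sieve maximum likelihood fit of a Log–Log classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. The link form P(y = 1 | x) = exp(-exp(-eta)), the additive polynomial sieve basis, the backtracking gradient ascent, the synthetic data, and every function name below are assumptions chosen only for illustration.

```python
import numpy as np

def pca_scores(X, k):
    """Project centered data onto its first k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def sieve_basis(Z, degree):
    """Assumed sieve: intercept plus powers 1..degree of each standardized score."""
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    cols = [np.ones(len(Z))]
    for j in range(Z.shape[1]):
        for d in range(1, degree + 1):
            cols.append(Z[:, j] ** d)
    return np.column_stack(cols)

def loglog_prob(B, beta):
    """Assumed log-log link: P(y=1) = exp(-exp(-eta)), with eta = B @ beta."""
    eta = np.clip(B @ beta, -20, 20)
    return np.exp(-np.exp(-eta))

def log_likelihood(B, y, beta):
    """Bernoulli log-likelihood of labels y under the log-log model."""
    p = np.clip(loglog_prob(B, beta), 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_sieve_mle(B, y, iters=200):
    """Maximize the likelihood over sieve coefficients by gradient ascent
    with backtracking, so the log-likelihood never decreases."""
    beta = np.zeros(B.shape[1])
    ll = log_likelihood(B, y, beta)
    for _ in range(iters):
        eta = np.clip(B @ beta, -20, 20)
        p = np.clip(np.exp(-np.exp(-eta)), 1e-12, 1 - 1e-12)
        # d(loglik)/d(eta) = (y - p) * exp(-eta) / (1 - p) for this link
        grad = B.T @ ((y - p) * np.exp(-eta) / (1 - p)) / len(y)
        step = 1.0
        while step > 1e-8:
            cand = beta + step * grad
            ll_cand = log_likelihood(B, y, cand)
            if ll_cand > ll:
                beta, ll = cand, ll_cand
                break
            step *= 0.5
    return beta

# Synthetic, unbalanced data (hypothetical): 10 observed features driven by
# 2 latent factors; y = 1 marks the rare class.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(400, 10))
y = (latent[:, 0] > 1.2).astype(float)

B = sieve_basis(pca_scores(X, k=2), degree=3)
beta = fit_sieve_mle(B, y)
p = loglog_prob(B, beta)
```

In this sketch the number of retained components `k` and the polynomial `degree` play the role of the sieve size; in practice both would be chosen by a criterion such as cross-validation, which is not shown here.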

Full Text