Abstract

Data-based soft sensors are widely used in industrial processes for online prediction of difficult-to-measure variables. However, many practical processes are rich in unlabeled data but poor in labeled data, which has become the main bottleneck in developing high-performance data-based soft sensors. To address this issue, two novel semi-supervised soft sensor methods are proposed: an evolutionary optimization based pseudo-labeling method (EOPL) and an ensemble EOPL method (EnEOPL). The proposed methods first formulate the pseudo-labeling of unlabeled data as an optimization problem in which the unknown labels (the pseudo-labels) serve as the decision variables. An evolutionary optimization approach, with Gaussian process regression (GPR) as the base learner, is then used to solve this problem. Next, a new GPR model is trained on an enlarged labeled training set that combines the labeled data with the high-confidence pseudo-labeled data. Furthermore, EOPL is extended to EnEOPL within an ensemble learning framework to further enhance prediction performance. Two case studies demonstrate that the proposed methods outperform traditional pseudo-labeling semi-supervised methods.
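The pseudo-labeling idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every specific in it is an assumption, because the abstract gives none of these details: the fitness function (a GPR trained only on the pseudo-labeled points should reproduce the true labels), the differential-evolution-style search, the use of GPR predictive variance as the confidence measure, the fixed kernel hyperparameters, and the names `rbf`, `gpr_predict`, and `eopl`.

```python
import numpy as np

def rbf(A, B, length=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gpr_predict(X_train, y_train, X_test, noise=1e-2):
    # GPR with fixed (assumed) hyperparameters and a constant mean.
    # Returns the predictive mean and variance at X_test.
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_test, X_train)
    mu = y_train.mean()
    mean = Ks @ np.linalg.solve(K, y_train - mu) + mu
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.maximum(var, 0.0)

def eopl(X_lab, y_lab, X_unlab, pop=30, gens=60, keep=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_unlab)

    def fitness(pseudo_y):
        # Assumed objective: train on pseudo-labeled data only and
        # measure the mean squared error on the truly labeled data.
        pred, _ = gpr_predict(X_unlab, pseudo_y, X_lab)
        return np.mean((pred - y_lab) ** 2)

    # Initial guess and confidence come from a GPR fitted to labeled data only.
    init, var = gpr_predict(X_lab, y_lab, X_unlab)
    P = init + 0.1 * y_lab.std() * rng.standard_normal((pop, n))
    P[0] = init
    f = np.array([fitness(p) for p in P])
    f_init = f[0]

    # Differential-evolution-style loop; the decision variables are the pseudo-labels.
    for _ in range(gens):
        for i in range(pop):
            a, b, c = P[rng.choice(pop, 3, replace=False)]
            trial = np.where(rng.random(n) < 0.9, a + 0.5 * (b - c), P[i])
            ft = fitness(trial)
            if ft < f[i]:  # greedy replacement
                P[i], f[i] = trial, ft
    best = P[f.argmin()]

    # Assumed confidence rule: keep the points with the lowest predictive variance.
    idx = np.argsort(var)[: int(keep * n)]
    X_aug = np.vstack([X_lab, X_unlab[idx]])
    y_aug = np.concatenate([y_lab, best[idx]])
    return X_aug, y_aug, f.min(), f_init

# Toy usage on synthetic 1-D data (illustrative only).
rng = np.random.default_rng(1)
X_lab = np.linspace(0, 5, 8)[:, None]
y_lab = np.sin(X_lab).ravel()
X_unlab = rng.uniform(0, 5, (20, 1))
X_aug, y_aug, f_best, f_init = eopl(X_lab, y_lab, X_unlab)
final_mean, _ = gpr_predict(X_aug, y_aug, X_unlab)  # model on the enlarged set
```

The enlarged set `(X_aug, y_aug)` plays the role of the combined labeled and high-confidence pseudo-labeled data on which the final GPR model is trained. An EnEOPL-style extension would presumably repeat this procedure over several diverse base models and combine their predictions, but the abstract does not describe how the ensemble is built.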
