Abstract

Missing data is common in industrial settings and causes problems in downstream applications, because most data-driven methods used there rely on a complete, high-quality dataset to build an accurate model. Existing methods impute missing data in isolation from the downstream application, treating all variables equally regardless of their different roles in that application. This can degrade imputation performance for key variables and, in turn, the accuracy of the downstream model. The challenge is how to tailor the imputation task to the downstream application. In this paper, a new method termed fine-tuned imputation GAN (FIGAN) is designed to achieve customized data imputation for industrial soft sensors. The paper makes two main contributions: 1) unlike the original imputation GAN (GAIN), which treats all variables equally, FIGAN is guided by a soft sensor module so that imputation is customized, with improved imputation on quality-related variables, which makes enhanced accuracy of the final industrial soft sensor possible; 2) because the labels of the soft sensor may themselves contain missing values, a soft sensor with pseudo-labeling is designed to address this problem, with data imputation and label prediction optimized interactively. Case studies on a converter steelmaking process and a penicillin fermentation process show the feasibility of the proposed FIGAN. Such customized imputation could be readily transferred to other downstream applications with missing data.

Note to Practitioners: Industrial data is often incomplete and needs proper treatment. Meanwhile, the downstream applications that consume the preprocessed data vary across industrial scenarios. The focus of this study is to develop a customized data imputation method for specific downstream applications such as soft sensing. A fine-tuned imputation GAN is designed with a soft sensor module that guides the data imputation toward accurate imputation of the key variables of the soft sensor. Because missing data exists not only in the measurement variables but also in the soft sensor labels, a semi-supervised soft sensor is designed to handle missing labels and is optimized together with the imputation model. The customized data imputation can thus improve the final performance of the downstream model, which is a soft sensor in this work. The customization could also be transferred to other applications, such as an anomaly detection model.
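
To make the idea of emphasizing quality-related variables concrete, the following Python sketch shows a mask-based imputation step and a reconstruction loss with per-variable weights. It is an illustration under stated assumptions, not the FIGAN implementation: the function names, the weight vector `quality_focused`, and the toy data are all hypothetical, and the abstract does not specify how the soft sensor module actually guides the generator.

```python
import numpy as np


def combine_imputation(x, mask, generated):
    """Keep observed entries (mask == 1); fill missing ones with generator output."""
    return mask * x + (1.0 - mask) * generated


def weighted_reconstruction_loss(x, mask, generated, weights):
    """Squared error on observed entries, scaled by per-variable weights.

    `weights` is a hypothetical importance vector of shape (n_features,);
    a larger weight penalizes errors on that variable more heavily.
    """
    sq_err = mask * (x - generated) ** 2
    return float((sq_err * weights).sum() / max(mask.sum(), 1.0))


# Toy data: two samples, three variables; zeros in x mark missing entries.
x = np.array([[1.0, 2.0, 0.0],
              [4.0, 0.0, 6.0]])
mask = np.array([[1.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0]])
generated = np.array([[1.5, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])  # stand-in for a generator's output

uniform = np.ones(3)                         # all variables treated equally
quality_focused = np.array([3.0, 1.0, 1.0])  # assumed: variable 0 matters most

imputed = combine_imputation(x, mask, generated)
loss_uniform = weighted_reconstruction_loss(x, mask, generated, uniform)
loss_focused = weighted_reconstruction_loss(x, mask, generated, quality_focused)
```

With uniform weights every variable contributes equally to the loss. With `quality_focused`, the same error on variable 0 costs three times as much, so a generator trained against this loss would be pushed toward better imputation of that variable.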
