Operational data from industrial processes typically consist of continuous and categorical variables. The two types reveal different aspects of the operating conditions, and both are useful for comprehensive fault detection and diagnosis (FDD). However, because of multiple production modes, these variables usually do not follow Gaussian or Bernoulli distributions. Furthermore, their correlations can differ between the normal class and the various fault classes. Thus, the main challenge lies in how to properly fuse the complementary information in the two types of variables and accurately characterize the complicated distribution of each class. This work proposes a flexible probabilistic framework that can concurrently analyze continuous and categorical variables for FDD. Our framework specifies a finite mixture model for each class. Thus, it can handle non-Gaussian and non-Bernoulli variables and capture their correlations under the conditional independence assumption. We then introduce variational inference for parameter estimation, which makes our framework adaptive to the different distributions of the various classes. Furthermore, a unified statistical index is designed for the two types of variables, giving our method the additional capability to distinguish unknown faults. Finally, the effectiveness of our method is validated on the Tennessee Eastman (TE) process and a practical industrial plant process. On the TE process, the average F1 score of our method improves by 3.4 and 3.9 percentage points in the single-mode and multimode settings, respectively, compared with traditional mixture discriminant analysis. In the industrial plant process, when unknown faults are added, the average F1 score of our method is 91.1%, only 1.2 percentage points lower than on the test set without unknown faults.