Abstract Industrial process equipment is bulky and structurally complex, so it is prone to faults during operation that reduce production efficiency, cause large economic losses, and can even threaten worker safety. Sustainable operation of large-scale industrial processes therefore requires timely and accurate monitoring and handling of abnormal situations. However, fault monitoring of large equipment requires collecting abundant data covering many complex, correlated variables, which generates excessive redundant data during monitoring. Moreover, the existing principal component analysis (PCA) method retains only the global characteristic of variance information and cannot capture the local characteristic that describes the topological relationship between data points, which limits its monitoring reliability and intelligence level. To address these issues, this paper proposes a fault diagnosis model for complex industrial processes based on chunked statistical global-local feature fusion (CSGLFF). First, considering the correlation between industrial process variables, a mutual information-based correlated-variable chunking method is designed to merge the variables with small correlation and obtain the optimal chunking of variables. Second, PCA and locality preserving projections (LPP) are combined to construct a global-local feature fusion (GLFF) model that extracts global and local features simultaneously. Each data chunk is fed into the GLFF model for feature extraction, and the corresponding CSGLFF model is established. In addition, Bayesian inference is used to fuse the statistics of the sub-chunks into an overall fault monitoring statistical indicator, and the cause of a fault is identified through the variable contribution graph.
Finally, two case studies, the Tennessee Eastman process (TEP) and a laboratory emulsion pump, were used to evaluate the performance of CSGLFF experimentally. The results show that, in comparison with the chunked statistical PCA, chunked statistical LPP, and GLFF algorithms, the fault monitoring accuracy of the proposed method reached 92.91% for the TEP (mean), 97% for flow pulsation impact, and 90.30% for pressure anomaly. The method performs well on data with a large number of variables: it reduces the generation of redundant data, improves the accuracy of industrial monitoring, and accurately identifies the variables related to a fault. This provides a theoretical basis for determining the fault location and gives maintenance staff a direction for repair.
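
The Bayesian fusion step can be illustrated with a minimal sketch. The abstract does not state the fusion formula, so the code below assumes a formulation commonly used in multiblock process monitoring: each chunk's statistic is turned into a posterior fault probability from its control limit, and the posteriors are combined in a weighted average. The function name `bayesian_fusion`, the significance level `alpha`, and all numbers are illustrative assumptions, not values from the paper.

```python
import math


def bayesian_fusion(stats, limits, alpha=0.01):
    """Fuse per-chunk monitoring statistics into one fault probability index.

    stats  : monitoring statistic of each chunk for one sample (e.g. T^2)
    limits : control limit of each chunk, in the same order
    alpha  : assumed significance level, used as the prior fault probability

    Returns a value in [0, 1]; under this assumed scheme a sample is
    flagged as faulty when the returned index exceeds alpha.
    """
    if len(stats) != len(limits):
        raise ValueError("stats and limits must have the same length")

    weighted_sum = 0.0
    weight_total = 0.0
    for stat, limit in zip(stats, limits):
        stat = max(stat, 1e-12)  # guard against division by zero
        # Assumed likelihoods of the sample under normal and faulty conditions.
        p_x_normal = math.exp(-stat / limit)
        p_x_fault = math.exp(-limit / stat)
        # Bayes' rule: posterior fault probability for this chunk.
        evidence = p_x_normal * (1.0 - alpha) + p_x_fault * alpha
        p_fault_x = p_x_fault * alpha / evidence
        # Chunks that look more faulty receive a larger weight.
        weighted_sum += p_x_fault * p_fault_x
        weight_total += p_x_fault

    if weight_total == 0.0:
        return 0.0  # every statistic is effectively zero: no sign of a fault
    return weighted_sum / weight_total


# Hypothetical statistics for three chunks, each with control limit 10.
normal_index = bayesian_fusion([2.0, 3.0, 4.0], [10.0, 10.0, 10.0])
fault_index = bayesian_fusion([2.0, 3.0, 80.0], [10.0, 10.0, 10.0])
print(normal_index < 0.01, fault_index > 0.01)
```

In this sketch a single chunk far above its limit is enough to push the fused index past `alpha`, which is the reason for weighting by the fault likelihood instead of taking a plain average over chunks.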