Abstract

Chemical process modeling underpins research and applications in related fields. As industrial informatization advances, data-driven process modeling techniques are increasingly applied to chemical processes, yielding more accurate results at lower model development cost. However, most chemical processes are high-dimensional and nonlinear, so problems such as the “curse of dimensionality” and information redundancy make models prone to overfitting, with reduced accuracy and weaker generalization. Many dimensionality reduction methods have been adopted to mitigate these problems, but most are limited by inaccurate association measurement and weak redundancy exclusion. In this paper, an information-theoretic analysis first reveals that data associations and information redundancy are widespread. Then, a feature selection method based on conditional refined maximal information coefficient maximization (CRMICM) is proposed to improve the consistency of association measurement and the accuracy of redundancy estimation with limited samples. A prediction modeling test on an actual fluidized catalytic cracking (FCC) process confirms extensive associations between the variables and the targets, and shows that only a few variables are essential for modeling while the rest are redundant. Compared with other methods, CRMICM achieves the best dimensionality reduction on the FCC process data in terms of both the number of selected features and model prediction accuracy, showing its good applicability to chemical processes.

