A data mining based approach for process identification using historical data

Ridouane Oulhiq, Khalid Benjelloun, Yassine Kali, Maarouf Saad

doi:10.1080/02286203.2021.1905375

Abstract

In this paper, a data mining based methodology for process identification from historical data is proposed. The methodology covers the phases of process understanding, data collection, data preparation, data modeling, and model evaluation. Because some parts of historical data are irrelevant, a data selection step based on the Gaussian Mixture Model (GMM) clustering algorithm is included. The methodology also includes a data informativity step to assess the richness of the data, using the condition number (CN) and the extended CN for ridge regression (RR CN). To evaluate the approach, two years of historical data from an industrial thickener were used. The data were prepared and an ARX (Auto-Regressive with eXogenous inputs) model structure was adopted to identify the model. Input delays were estimated using Granger causality. For model fitting, least squares regression was tested and compared to ridge regression, which is less sensitive to multicollinearity. The results were then evaluated based on the 20-step ahead prediction and compared to existing methods. The proposed approach gave the best results, with an R² of 98.11% and 62.70% for 1-step and 20-step ahead predictions, respectively.
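To illustrate the data informativity and model fitting steps named in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes a single-input ARX model, and it takes the ridge-extended condition number to be the condition number of XᵀX + λI, which is one common definition and may differ from the one used in the paper. The function names, the model orders, the delay, and the value of λ are all hypothetical.

```python
import numpy as np

def arx_regressors(y, u, na, nb, delay):
    """Build the regression matrix of a single-input ARX model:
    y[t] = sum_i a_i * y[t-i] + sum_j b_j * u[t-delay-j], i=1..na, j=0..nb-1.
    """
    start = max(na, delay + nb - 1)  # first t with all lagged samples available
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, na + 1)]
        past_u = [u[t - delay - j] for j in range(nb)]
        rows.append(past_y + past_u)
    return np.array(rows), np.asarray(y[start:])

def condition_numbers(X, lam):
    """Return (CN, ridge CN) computed from the singular values of X.
    CN is the condition number of X^T X; the ridge CN is assumed here
    to be the condition number of X^T X + lam * I.
    """
    s = np.linalg.svd(X, compute_uv=False)  # sorted in descending order
    cn = s[0] ** 2 / s[-1] ** 2
    rr_cn = (s[0] ** 2 + lam) / (s[-1] ** 2 + lam)
    return cn, rr_cn

def fit_ridge(X, y, lam):
    """Ridge estimate; lam = 0 reduces to ordinary least squares."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

if __name__ == "__main__":
    # Synthetic data standing in for historical process records (illustrative only).
    rng = np.random.default_rng(0)
    u = rng.normal(size=500)
    y = np.zeros(500)
    for t in range(2, 500):
        y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 2] + 0.01 * rng.normal()

    X, target = arx_regressors(y, u, na=1, nb=1, delay=2)
    print("CN, ridge CN:", condition_numbers(X, lam=1.0))
    print("least squares:", fit_ridge(X, target, lam=0.0))
    print("ridge:", fit_ridge(X, target, lam=1.0))
```

A large CN signals nearly collinear regressors, in which case the least squares estimate becomes sensitive to noise; adding λ to every eigenvalue of XᵀX bounds the ridge CN and stabilizes the estimate at the cost of some bias.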

Full Text