Frequent pattern mining-based log file partition for process mining

László Bántay, János Abonyi

doi:10.1016/j.engappai.2023.106221

Open Access

https://doi.org/10.1016/j.engappai.2023.106221

Abstract

Process mining, a technique for discovering process models from event sequences, is growing in popularity in the process industry. Process mining algorithms assume that a log file contains events generated by a single unknown process; when this assumption is not met, the resulting models can be extremely complex and inaccurate. To address this issue, this article proposes a frequent pattern mining-based method for log file partitioning that allows parallel processes to be explored. The key idea is that frequent pattern mining can identify groups of events that frequently occur together and generate sub-logs of overlapping sub-processes. Pre-processing the log files in this way allows more compact and interpretable process models to be identified. We developed a set of goal-oriented metrics to evaluate the complexity of process mining problems and of the resulting models. The applicability and effectiveness of the method are demonstrated by analyzing the process alarms of an industrial plant. The results confirm that partitioning the log file with frequent pattern mining enables the discovery of targeted sub-process models, and that the effectiveness of the method increases with the number of parallel processes stored in the same log file. We recommend applying the method whenever the logged events have no clear start and end, so that a single log file may describe several different processes.
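
The partitioning idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed-width time windows, the brute-force itemset counting, the `min_support` threshold, and all function names are assumptions made only for illustration, and the toy log is invented. Events are grouped into windows, event sets that co-occur in enough windows are kept, and each maximal frequent set yields one sub-log.

```python
from collections import Counter
from itertools import combinations


def windows(log, width):
    """Group (timestamp, event) pairs into fixed-width time windows.

    Fixed-width windowing is an assumption of this sketch; it stands in
    for whatever transaction definition a real analysis would use.
    """
    buckets = {}
    for t, e in log:
        buckets.setdefault(int(t // width), set()).add(e)
    return list(buckets.values())


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force count of event sets of size 2..max_size per window."""
    counts = Counter()
    for tx in transactions:
        for k in range(2, max_size + 1):
            for combo in combinations(sorted(tx), k):
                counts[combo] += 1
    n = len(transactions)
    return {frozenset(c) for c, cnt in counts.items() if cnt / n >= min_support}


def maximal(itemsets):
    """Keep only itemsets that are not strict subsets of another itemset."""
    return [s for s in itemsets if not any(s < other for other in itemsets)]


def partition(log, itemsets):
    """Build one sub-log per itemset; an event may land in several sub-logs."""
    return {
        tuple(sorted(s)): [(t, e) for t, e in log if e in s]
        for s in itemsets
    }


# Invented toy log: two interleaved processes, A -> B -> C and X -> Y.
log = [
    (0, "A"), (1, "X"), (2, "B"), (3, "Y"), (4, "C"),
    (20, "A"), (21, "B"), (22, "C"),
    (40, "X"), (41, "Y"),
    (60, "A"), (61, "B"), (62, "C"), (63, "X"), (64, "Y"),
]

transactions = windows(log, width=10)
patterns = maximal(frequent_itemsets(transactions, min_support=0.75))
sub_logs = partition(log, patterns)

for events, sub_log in sorted(sub_logs.items()):
    print(events, sub_log)
```

Each sub-log could then be passed to a process discovery algorithm separately. Because an event is copied into every sub-log whose itemset contains it, the resulting sub-logs may overlap, which matches the overlapping sub-processes mentioned above. The brute-force counting is only practical for small event alphabets; a dedicated algorithm such as Apriori or FP-growth would be the usual choice for real logs.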

Full Text