Abstract

Wastewater treatment processes are inherently dynamic because of large variations in influent flow rate, weather conditions, concentration, and composition. Trace clustering techniques from the process mining field are well suited to both analyzing the execution of wastewater treatment processes and confirming their compliance. However, much of the existing trace clustering research relies on activity names alone to discover process scenarios and does not consider the other information recorded in event logs. In addition, many algorithms commonly used in the literature, such as k-means clustering, require the number of process scenarios in the log to be specified in advance, and this number is often not known a priori. This paper presents an approach that uses timing information to help discover process scenarios from event logs of wastewater treatment processes without requiring any prior knowledge about those scenarios. A real wastewater treatment process provided by a domain expert is used as a case study to investigate the effectiveness and validity of the approach. We also use five real-life event logs to compare the proposed approach with the commonly used k-means clustering approach for process scenario discovery. The comparison is based on the F1 score, i.e., the harmonic mean of the models' weighted average fitness and weighted average precision. The experimental results show that (1) the proposed approach is able to discover process scenarios from event logs in the wastewater treatment domain; and (2) the process scenario models obtained with the additional timing information have higher fitness and higher precision scores than the models obtained without it.
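
To make the evaluation metric concrete, the following is a minimal Python sketch of the F1 score described above: the harmonic mean of the weighted average fitness and the weighted average precision over the discovered process scenario models. The abstract does not state how the averages are weighted, so weighting each model by the number of traces in its cluster is an assumption here. The function names and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Average of per-model scores, weighted by the given weights.

    Assumption: the weights are the trace counts of each cluster.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def f1_score(fitness: float, precision: float) -> float:
    """Harmonic mean of fitness and precision (0.0 if both are zero)."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


# Hypothetical example: two discovered scenarios (clusters).
cluster_sizes = [60, 40]          # traces per cluster (illustrative values)
fitness_per_model = [0.9, 0.8]    # illustrative values
precision_per_model = [0.7, 0.6]  # illustrative values

avg_fitness = weighted_average(fitness_per_model, cluster_sizes)
avg_precision = weighted_average(precision_per_model, cluster_sizes)
overall_f1 = f1_score(avg_fitness, avg_precision)
print(f"fitness={avg_fitness:.3f} precision={avg_precision:.3f} F1={overall_f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a set of models scores well on F1 only when both fitness and precision are high.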
