In process mining, an event log is a structured collection of recorded events that describes how processes are executed within an organization. Accurate and reliable process models depend on complete event logs. Incomplete logs, which can result from system errors, manual data entry mistakes, or irregular operational patterns, undermine the integrity of these models. This research aims to improve the performance and robustness of process models by transforming incomplete event logs into complete ones using a process discovery algorithm. Genetic process mining, a type of process discovery algorithm, is chosen because it evaluates multiple candidate solutions concurrently, which helps it recover missing events and improve log completeness. In its original form, however, the genetic process mining algorithm is not optimized for incomplete logs and can discover incorrect models. To address this limitation, this research proposes a modified approach that incorporates timing information. By leveraging timing data, the algorithm can infer missing events, leading to more accurate process tracking and reconstruction. Experimental results validate the modified algorithm, showing higher fitness and precision scores, improved process model comparisons, and a good level of coverage without errors. Additionally, several advanced conformance-checking metrics are presented to further validate the process models and event logs discovered by both algorithms.