Automated process discovery, one of the paradigms of process mining, has attracted interest from both industry and academic researchers. These methods provide visibility into, and understanding of, complex and unstructured event logs. Over the past decade, the classic heuristic miner and applied heuristic-based process discovery algorithms have shown promising results in revealing hidden process patterns in information systems. One challenge with such algorithms is the arbitrary selection of recorded behaviors in an event log: the filtering thresholds they offer are adjusted manually, which can lead to the extraction of a non-optimal process model. This issue is also visible in commercial process mining solutions. Recently, the first version of the stable heuristic miner algorithm targeted this issue by evaluating the statistical stability of an event log. However, that version was limited to evaluating only activities’ behaviors. In this article, we evaluate the statistical stability of both the activities and the edges of a graph that can be discovered from an event log. As a contribution, the stable heuristic miner 2 is introduced. Consequently, the definition of the descriptive reference process model is improved. The novel algorithm is evaluated using two real-world event logs: the familiar Sepsis data set and the urology department patients’ pathways event log, which was recorded by monitoring the interpreted location data of patients on hospital premises and is shared with the scientific community in this article.