Process mining methods have proven effective in turning historical log data into actionable process knowledge. However, most of them assume that the events reported in the logs can be easily mapped to well-defined process activities, which are the terms in which analysts are used to reasoning about process behavior. Here we consider the challenging scenario where this assumption does not hold: the log traces are sequences of low-level operations with no explicit reference to the corresponding high-level process activities. In this setting, we face the fundamental problem of bringing the log traces to the abstraction level of the analyst's perspective. Formally, given a trace Φ, and on the basis of a high-level behavioral description of the processes, we search for every possible interpretation ⟨σ, W⟩ of Φ, where σ is a sequence of high-level activities whose execution may have generated the sequence of low-level operations Φ and, in turn, W is a process that may have triggered the execution of σ. We address this problem probabilistically, and propose a framework that builds a compact representation of Φ's interpretations, each associated with a probability score. This score measures how likely it is that the associated interpretation is the correct one, and it is evaluated by adopting a revision paradigm guided by the background knowledge provided by the process models. Notably, our approach can deal with "complex" activities (i.e., activities that each generate a sequence of low-level operations, rather than a single one), and with the case where the traces encode process instances that deviate from the expected behaviors encoded in the process models.
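To make the notion of an interpretation ⟨σ, W⟩ concrete, the following is a minimal toy sketch, not the framework proposed here. It enumerates every way of splitting a trace Φ into the operation sequences of high-level activities, keeps the pairs ⟨σ, W⟩ for which a process W admits σ, and scores them by simple normalization. All names (`EMISSIONS`, `PROCESSES`, `segmentations`, `interpretations`) and all numbers are hypothetical; the sketch lists interpretations explicitly rather than building a compact representation, uses plain normalization in place of the revision paradigm, and does not handle deviating traces.

```python
from typing import Dict, Iterator, Tuple

Ops = Tuple[str, ...]    # a sequence of low-level operations
Sigma = Tuple[str, ...]  # a sequence of high-level activities

# Hypothetical mapping: each activity may emit one of several operation
# sequences, each with an assumed probability ("complex" activities emit
# more than one operation).
EMISSIONS: Dict[str, Dict[Ops, float]] = {
    "A": {("x",): 0.5, ("x", "y"): 0.5},
    "B": {("y",): 1.0},
    "C": {("z",): 1.0},
}

# Hypothetical process models: an assumed prior and the activity
# sequences each process admits.
PROCESSES = {
    "W1": {"prior": 0.5, "runs": {("A", "B", "C"), ("A", "C")}},
    "W2": {"prior": 0.5, "runs": {("A", "C")}},
}


def segmentations(trace: Ops) -> Iterator[Tuple[Sigma, float]]:
    """Yield (sigma, likelihood) for each way to explain the trace."""
    if not trace:
        yield (), 1.0
        return
    for activity, emissions in EMISSIONS.items():
        for ops, p in emissions.items():
            # The activity explains a prefix of the trace; recurse on the rest.
            if trace[:len(ops)] == ops:
                for rest, q in segmentations(trace[len(ops):]):
                    yield (activity,) + rest, p * q


def interpretations(trace: Ops) -> Dict[Tuple[Sigma, str], float]:
    """Return normalized scores for every interpretation (sigma, W)."""
    scores: Dict[Tuple[Sigma, str], float] = {}
    for sigma, likelihood in segmentations(trace):
        for name, model in PROCESSES.items():
            if sigma in model["runs"]:
                key = (sigma, name)
                scores[key] = scores.get(key, 0.0) + likelihood * model["prior"]
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else {}


if __name__ == "__main__":
    for (sigma, process), score in interpretations(("x", "y", "z")).items():
        print(sigma, process, round(score, 3))
```

In this toy setting the trace `("x", "y", "z")` admits three interpretations, since the operation `y` can be attributed either to the complex activity `A` or to a separate activity `B`, and the sequence `("A", "C")` is admitted by both hypothetical processes.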