Adversarial System Variant Approximation to Quantify Process Model Generalization

Julian Theis, Houshang Darabi

doi:10.1109/access.2020.3033450

Abstract

In process mining, process models are extracted from event logs using process discovery algorithms and are commonly assessed using multiple quality dimensions. While the metrics that measure the relationship of an extracted process model to its event log are well-studied, quantifying the degree to which a process model can describe the unobserved behavior of its underlying system remains underexplored in the literature. In this paper, a novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue. Sequence Generative Adversarial Networks are trained on the variants contained in an event log to approximate the underlying variant distribution of the system behavior. Unobserved realistic variants are sampled either directly from the Sequence Generative Adversarial Network or by leveraging the Metropolis-Hastings algorithm. The degree to which a process model relates to its underlying unknown system behavior is then quantified based on the realistic observed and estimated unobserved variants using established process model quality metrics. Significant performance improvements in revealing realistic unobserved variants are demonstrated in a controlled experiment on 15 ground truth systems. Additionally, the proposed methodology is experimentally tested and evaluated to quantify the generalization of 60 discovered process models with respect to their systems.
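To make the terms "variant" and "unobserved variant" concrete, the following is a minimal sketch, assuming an event log is represented as a list of traces and each trace as a list of activity names. The function names and the toy log are illustrative assumptions and do not come from the paper; the Sequence Generative Adversarial Network and the Metropolis-Hastings sampler are not shown.

```python
from collections import Counter


def variant_distribution(event_log):
    """Map each variant (a tuple of activity names) to its relative frequency in the log."""
    counts = Counter(tuple(trace) for trace in event_log)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


def unobserved_variants(sampled, observed):
    """Return sampled variants absent from the observed ones, deduplicated, in first-seen order."""
    seen = set(observed)
    result = []
    for variant in map(tuple, sampled):
        if variant not in seen:
            seen.add(variant)
            result.append(variant)
    return result


# Hypothetical toy log: four traces, two distinct variants.
toy_log = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"], ["a", "b", "c"]]
dist = variant_distribution(toy_log)

# Hypothetical samples standing in for the output of a generative model.
samples = [["a", "b", "c"], ["a", "d"], ["a", "d"]]
new_variants = unobserved_variants(samples, dist)
```

Under these assumptions, `dist` is the empirical variant distribution that a generative model would be trained to approximate, and `new_variants` holds the candidate unobserved variants that would be passed, together with the observed ones, to a process model quality metric.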

Highlights

Process mining is a comparatively young research discipline that enjoys ever-increasing popularity, with applications in multiple domains such as Healthcare [1], Manufacturing [2], Robotic Process Automation [3], Human-Computer Interaction [4], and Simulation [5].
The visualization of the number of generated variants shows that the Sequence Generative Adversarial Networks (SGANs) generate a number of variants closer to the ground truth than the Petri nets (PNs) do across all datasets.
The answer to Q1 is that the Adversarial System Variant Approximation (AVATAR) sampling methodology approximates the true number of system variants better than process models built by state-of-the-art discovery algorithms.

Summary

Introduction

Process mining is a comparatively young research discipline that enjoys ever-increasing popularity, with applications in multiple domains such as Healthcare [1], Manufacturing [2], Robotic Process Automation [3], Human-Computer Interaction [4], and Simulation [5]. Process mining techniques are leveraged to analyze a system. The system is observed during runtime and its process steps are recorded sequentially. These recordings are used to discover process models that are supposed to provide insights into the underlying system, i.e., to describe its behavior. An illustrative example is a complex manufacturing plant for a given product. Such a system usually requires machines to be filled with raw materials, components to be moved and assembled, and final products to be placed on pallets. Examples of corresponding observable process steps are conveyor start, start

Methods

Results

Conclusion