Abstract

Predictive process monitoring aims at forecasting the behavior, performance, and outcomes of business processes at runtime. It helps identify problems before they occur and re-allocate resources before they are wasted. Although deep learning (DL) has yielded breakthroughs, most existing approaches build on classical machine learning (ML) techniques, particularly when it comes to outcome-oriented predictive process monitoring. This circumstance reflects a lack of understanding about which event log properties facilitate the use of DL techniques. To address this gap, the authors compared the performance of DL (i.e., simple feedforward deep neural networks and long short-term memory networks) and ML techniques (i.e., random forests and support vector machines) based on five publicly available event logs. It could be observed that DL generally outperforms classical ML techniques. Moreover, three specific propositions could be inferred from further observations: First, the outperformance of DL techniques is particularly strong for logs with a high variant-to-instance ratio (i.e., many non-standard cases). Second, DL techniques perform more stably in the case of imbalanced target variables, especially for logs with a high event-to-activity ratio (i.e., many loops in the control flow). Third, logs with a high activity-to-instance payload ratio (i.e., input data is predominantly generated at runtime) call for the application of long short-term memory networks. Due to the purposive sampling of event logs and techniques, these findings also hold for logs outside this study.
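To make two of the log properties named in the abstract more concrete, the following minimal sketch computes a variant-to-instance ratio and an event-to-activity ratio for a toy event log. The formulas are assumptions chosen for illustration, since this summary does not spell out the exact definitions: the variant-to-instance ratio is taken as distinct traces divided by process instances, and the event-to-activity ratio as the mean, over instances, of events divided by distinct activities. The function name `log_ratios` and the toy log are hypothetical and not taken from the study.

```python
from collections import defaultdict


def log_ratios(events):
    """Return (variant_to_instance, event_to_activity) for an event log.

    `events` is an iterable of (case_id, activity) pairs, assumed to be
    ordered by time within each case. Both formulas are illustrative
    assumptions, not the study's definitions.
    """
    traces = defaultdict(list)
    for case_id, activity in events:
        traces[case_id].append(activity)
    if not traces:
        raise ValueError("event log is empty")

    n_instances = len(traces)

    # A variant is a distinct sequence of activities.
    variants = {tuple(trace) for trace in traces.values()}
    variant_to_instance = len(variants) / n_instances

    # Events per distinct activity, averaged over instances.
    # A value above 1 means activities repeat, which hints at loops.
    event_to_activity = (
        sum(len(trace) / len(set(trace)) for trace in traces.values())
        / n_instances
    )
    return variant_to_instance, event_to_activity


# Hypothetical toy log: four cases, one of which repeats activity "B".
toy_log = [
    ("c1", "A"), ("c1", "B"), ("c1", "C"),
    ("c2", "A"), ("c2", "B"), ("c2", "C"),
    ("c3", "A"), ("c3", "B"), ("c3", "B"), ("c3", "C"),
    ("c4", "A"), ("c4", "C"),
]
v2i, e2a = log_ratios(toy_log)
print(f"variant-to-instance ratio: {v2i:.2f}")
print(f"event-to-activity ratio:   {e2a:.2f}")
```

Under these assumed definitions, a log in which every case follows its own path approaches a variant-to-instance ratio of 1, while a highly standardized log approaches 0. The activity-to-instance payload ratio is not covered here because it depends on attribute-level information that this toy log does not carry.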

Highlights

  • Gaining knowledge from data is an emergent topic in many disciplines (Hashem et al 2015), high on many organizations’ agendas, and a macro-economic game-changer (Lund et al 2013)

  • O3: deep learning (DL) classifiers substantially outperform classical machine learning (ML) classifiers regarding ROC AUC for logs with a high event-to-activity ratio and imbalanced class labels. For the production log (PL), BPI Challenge 2011 (BPIC11), and BPI Challenge 2013 (BPIC13), we found that DL leads to a considerably higher ROC AUC score, which points to a more balanced classification with fewer alpha and beta errors

  • There is a lack of knowledge related to which log properties facilitate the use of DL techniques in this domain
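The role of ROC AUC under imbalanced class labels, mentioned in the highlights above, can be illustrated with a small sketch. It uses made-up labels and scores (not data from the study) and a plain pairwise definition of ROC AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The helper names `roc_auc` and `accuracy` are hypothetical.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC: P(score_pos > score_neg) + 0.5 * P(tie)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at a fixed score threshold."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


# Hypothetical imbalanced outcome: nine negative cases, one positive case.
labels = [0] * 9 + [1]

# Classifier A gives every case the same low score (always "negative").
constant_scores = [0.1] * 10

# Classifier B also stays below the threshold, but ranks the positive
# case above every negative case.
ranking_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.3, 0.45]

for name, scores in [("A", constant_scores), ("B", ranking_scores)]:
    print(name, "accuracy:", accuracy(labels, scores),
          "ROC AUC:", roc_auc(labels, scores))
```

Both toy classifiers reach the same accuracy because neither ever predicts the minority class at the 0.5 threshold, yet only the second separates the classes by score. This is why a threshold-independent measure such as ROC AUC is more informative than accuracy when class labels are imbalanced.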


Summary

Introduction

Gaining knowledge from data is an emergent topic in many disciplines (Hashem et al 2015), high on many organizations’ agendas, and a macro-economic game-changer (Lund et al 2013). These activities are supported by process-aware information systems that record events and additional attributes, e.g., resources or process outcomes (van der Aalst et al 2011a). We investigate the following research question: Which event log properties facilitate the use of DL techniques for outcome-oriented predictive process monitoring? To obtain transferable results and related propositions, we combined data-to-description (Level-1 inference) and description-to-theory (Level-2 inference) generalization, as included in Lee and Baskerville's (2003) generalization framework for information systems research. This required purposively sampling both techniques and logs.

Kratsch et al.: Machine Learning in Business Process Monitoring..., Bus Inf Syst Eng 63(3):261–276 (2021)

Data-Driven Approaches in Business Process Management
Machine Learning as a Predictive Process Monitoring Technique
Performance Evaluation of Machine Learning Classifiers
Study Design
Description of the Used Event Logs
Classification of the Used Event Logs
Labeling of Process Instances
Sequence Encoding
Implementing the Classifiers
Result
Summary
Implications
Findings
Limitations and Future Research