Business process similarity measures are of vital importance for process repository management applications such as process query, process recommendation, and process clustering. Most existing approaches measure process similarity by relying on control-flow structures only. This article investigates the role of data in process similarity measurement. To incorporate data-flow information into the business process control flow, we propose a data-aware workflow net (DWF-net), which extends the classical workflow net with data reading and writing semantics. We then introduce three types of similarity measures, i.e., data item set-based similarity, data operation set-based similarity, and data-aware behavior-based similarity, to quantify the similarity of data-aware business processes from different perspectives. Next, we introduce a methodology to help process analysts apply these three measures in a systematic way. Finally, we evaluate the effectiveness and applicability of the proposed similarity measures through a group of comparative experiments.
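To make the two set-based measures concrete, the following is a minimal sketch and not the definition used in the article, which the abstract does not state. It assumes that both measures can be approximated by a Jaccard coefficient, that a data operation is a (task, mode, item) triple, and that tasks are matched by identical labels. The two example processes and all function names are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical processes: each task maps to the data items it reads and writes.
order_v1 = {
    "receive_order": {"read": set(), "write": {"order"}},
    "check_credit": {"read": {"order"}, "write": {"credit_ok"}},
    "ship": {"read": {"order", "credit_ok"}, "write": {"shipment"}},
}
order_v2 = {
    "receive_order": {"read": set(), "write": {"order"}},
    "check_stock": {"read": {"order"}, "write": {"stock_ok"}},
    "ship": {"read": {"order", "stock_ok"}, "write": {"shipment"}},
}


def data_items(process: dict) -> set:
    """All data items touched by any task, regardless of read/write mode."""
    items = set()
    for ops in process.values():
        items |= ops["read"] | ops["write"]
    return items


def data_operations(process: dict) -> set:
    """All (task, mode, item) triples; assumes tasks are matched by label."""
    return {
        (task, mode, item)
        for task, ops in process.items()
        for mode, items in ops.items()
        for item in items
    }


# Assumed stand-ins for the item set-based and operation set-based measures.
item_sim = jaccard(data_items(order_v1), data_items(order_v2))
op_sim = jaccard(data_operations(order_v1), data_operations(order_v2))
print(f"data item similarity: {item_sim:.2f}")
print(f"data operation similarity: {op_sim:.2f}")
```

In this sketch the operation-level score is lower than the item-level score because it also penalizes differences in which task reads or writes each item. The behavior-based measure is not sketched here, since it depends on the DWF-net semantics that the abstract only names.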