Abstract

The quality of event log data constrains how reliable the insights gained from process mining can be. Activity labels pose particular quality problems: they are meant to represent organisational activities, but the same activity may appear under different labels (e.g., manual entry may introduce synonyms). Ideally, such problems are remedied by domain experts, but these experts are time-poor and data cleaning is a time-consuming and tedious task. Ontologies provide a means to formalise domain knowledge, and because they can be extended and reused over time, they offer a scalable solution to activity label similarity problems. Existing approaches to activity label quality improvement use manually generated ontologies or ontologies that are too general (e.g., WordNet). Limited attention has been paid to facilitating the development of purposeful ontologies in the field of process mining. This paper is concerned with the creation of activity ontologies by domain experts. For the first time in the field of process mining, their participation is facilitated and motivated through the application of techniques from crowdsourcing and gamification. An evaluation in which 35 participants constructed activity ontologies with our approach shows that they found the method engaging and that its application results in high-quality ontologies.

Highlights

  • Process mining is the analysis of event logs, historical process data, which generates actionable insights to improve processes by business owners [1]

  • While existing approaches focus on synonymous labels, the approach proposed in this paper identifies synonymy as well as other types of semantic relations between activity labels, namely hypernymy, holonymy, and antonymy, through the use of a crowdsourced gamified system

  • The semantic relations distinguished in this approach, e.g., synonymy, antonymy, hypernymy, and holonymy, are general and can exist between activity labels from different domains, as they are adapted from the semantic relations defined in WordNet [16]

Summary

Introduction

Process mining is the analysis of event logs (historical process data) to generate actionable insights that business owners can use to improve processes [1]. The use of poor-quality event logs leads to unreliable analysis results and insights (i.e., garbage in – garbage out) [2]. Of critical importance are the activities that are performed in a process, and their correct identification requires deep insight into the domain involved. The same activity in a process might be referred to by different labels, a case of so-called synonymous labels [3]. Activity labels at different abstraction levels (i.e., too detailed vs. too general) are also quite common [4].
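
To make the synonymous-label problem described above concrete, the following minimal Python sketch shows how a small activity ontology could be used to map label variants to one canonical label before analysis. It is an illustration only, not the method proposed in the paper: the hospital-style labels, the `SYNONYMS` mapping, and the helper names `build_lookup` and `normalise_log` are all hypothetical, and only the synonymy relation is covered.

```python
from collections import Counter

# Hypothetical ontology fragment (assumption, not from the paper):
# each canonical activity maps to the labels marked as its synonyms.
SYNONYMS = {
    "Admit patient": {"Patient admission", "Admission of patient"},
    "Discharge patient": {"Patient discharge"},
}


def build_lookup(synonyms):
    """Invert the ontology into a case-insensitive label -> canonical map."""
    lookup = {}
    for canonical, variants in synonyms.items():
        lookup[canonical.lower()] = canonical
        for variant in variants:
            lookup[variant.lower()] = canonical
    return lookup


def normalise_log(events, lookup):
    """Replace each activity label by its canonical form.

    Labels that the ontology does not know are left unchanged.
    """
    return [
        (case_id, lookup.get(label.strip().lower(), label))
        for case_id, label in events
    ]


# Toy event log as (case id, activity label) pairs.
events = [
    ("c1", "Patient admission"),
    ("c1", "Discharge patient"),
    ("c2", "admit patient "),
    ("c2", "Triage"),
]

lookup = build_lookup(SYNONYMS)
cleaned = normalise_log(events, lookup)
counts = Counter(label for _, label in cleaned)
```

In this toy log, the two differently written admission labels collapse into a single activity, so frequency counts and discovered process models would no longer treat them as separate steps. Relations such as hypernymy or holonymy would need a richer structure than a flat synonym map.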
