Abstract

Existing process mining methodologies note the importance of data quality but do not detail how to assess the quality of event data, or how identified data quality issues can be exploited in the planning, data extraction and log building phases of a process mining analysis. To this end, we adapt CRISP-DM [15] to supplement the Planning phase of the PM\(^2\) [6] process mining methodology so that it specifically includes data understanding and quality assessment. We illustrate our approach in a case study describing the detailed preparation for a process mining analysis of ground and aero-medical pre-hospital transport processes involving the Queensland Ambulance Service (QAS) and Retrieval Services Queensland (RSQ). We use QAS and RSQ sample data to show how data models and some quality metrics can be used to (i) identify data quality issues, (ii) anticipate and explain certain observable features in process mining analyses, (iii) distinguish between systemic and occasional quality issues, and (iv) reason about the mechanisms by which identified quality issues may have arisen in the event log. We contend that this knowledge can be used to guide the extraction and pre-processing stages of a process mining case study.

