Abstract

Process mining has been successfully applied in the healthcare domain and has helped to uncover various insights for improving healthcare processes. While the benefits of process mining are widely acknowledged, many people rightfully have concerns about irresponsible uses of personal data. Healthcare information systems contain highly sensitive information, and healthcare regulations often require protection of data privacy. The need to comply with strict privacy requirements may result in decreased data utility for analysis. Until recently, data privacy issues received little attention in the process mining community; however, several privacy-preserving data transformation techniques have been proposed in the data mining community. Many similarities between data mining and process mining exist, but there are key differences that make privacy-preserving data mining techniques unsuitable for anonymising process data without adaptation. In this article, we analyse data privacy and utility requirements for healthcare process data and assess the suitability of privacy-preserving data transformation methods to anonymise healthcare data. Using three publicly available healthcare event logs, we demonstrate how some of these anonymisation methods affect various process mining results. We describe a framework for privacy-preserving process mining that can support healthcare process mining analyses. We also advocate the recording of privacy metadata to capture information about privacy-preserving transformations performed on an event log.

Highlights

  • Technological advances in the fields of business intelligence and data science empower organisations to become “data-driven” by applying new techniques to analyse large amounts of data

  • We use the anonymised event logs and show how the anonymisation methods affect the results of process mining approaches that are frequently used in the healthcare domain: process discovery, process conformance analysis, process performance analysis, organisational mining, and process variant analysis

  • Generalisation of all timestamps had no effect on the results of process discovery and process conformance analysis plugins that take activity sequences as input, but it affected the results of process performance analysis. Activity suppression, on the other hand, had a minimal effect on the average case throughput time but affected process discovery and conformance analysis results in some logs; it affected many events in the BPIC11 log and few events in the other logs.
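
The contrast in the last highlight can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy event log, the choice of day-level generalisation, the frequency threshold used for suppression, and the helper names. It only aims to show why coarsening timestamps leaves activity sequences intact while changing durations, and why removing a rare mid-trace activity changes traces but not necessarily the throughput time.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy event log: (case id, activity, timestamp), recorded in time order.
LOG = [
    ("c1", "Register", datetime(2020, 1, 1, 9, 0)),
    ("c1", "Triage", datetime(2020, 1, 1, 9, 30)),
    ("c1", "Discharge", datetime(2020, 1, 2, 14, 0)),
    ("c2", "Register", datetime(2020, 1, 3, 8, 0)),
    ("c2", "Triage", datetime(2020, 1, 3, 8, 30)),
    ("c2", "Rare test", datetime(2020, 1, 3, 10, 0)),
    ("c2", "Discharge", datetime(2020, 1, 3, 18, 0)),
]


def generalise_timestamps(log):
    """Truncate every timestamp to the day (one possible generalisation level)."""
    return [
        (case, act, ts.replace(hour=0, minute=0, second=0, microsecond=0))
        for case, act, ts in log
    ]


def suppress_rare_activities(log, min_count):
    """Drop events whose activity occurs fewer than min_count times in the log."""
    counts = Counter(act for _, act, _ in log)
    return [(case, act, ts) for case, act, ts in log if counts[act] >= min_count]


def traces(log):
    """Activity sequence per case, in recorded event order."""
    result = defaultdict(list)
    for case, act, _ in log:
        result[case].append(act)
    return dict(result)


def throughput_hours(log):
    """Hours between the first and last event of each case."""
    times = defaultdict(list)
    for case, _, ts in log:
        times[case].append(ts)
    return {c: (max(t) - min(t)).total_seconds() / 3600 for c, t in times.items()}


generalised = generalise_timestamps(LOG)
suppressed = suppress_rare_activities(LOG, min_count=2)

print(traces(generalised) == traces(LOG))   # sequences are unchanged
print(throughput_hours(LOG), throughput_hours(generalised))  # durations differ
print(traces(suppressed)["c2"])             # the rare activity is gone
```

In this sketch the recorded event order is kept after generalisation. A tool that re-sorts events by the coarsened timestamps could reorder events that fall on the same day, so the unchanged-sequence result depends on that assumption.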

Summary

Introduction

Technological advances in the fields of business intelligence and data science empower organisations to become “data-driven” by applying new techniques to analyse large amounts of data. Process mining is a specialised form of data-driven analytics in which process data, collated from the different IT systems typically available in organisations, are analysed to uncover the real behaviour and performance of business operations [1]. Process mining has been successfully applied in the healthcare domain and has helped to uncover insights for improving the operational efficiency of healthcare processes and for supporting evidence-informed decision making [2,3,4,5,6]. Healthcare data can include highly sensitive attributes (e.g., patient health outcomes or diagnoses, and the types of treatment being undertaken). The privacy of such data needs to be protected.

