Abstract

Information systems support the execution of business processes. In doing so, they record data about process execution in event logs, which can be used to analyse the control-flow of the respective processes. However, such data may contain personal information on process stakeholders that is protected by privacy regulations. Process analysis based on event logs should, therefore, employ anonymization techniques. In this paper, we introduce two approaches to anonymize the recorded control-flow of a process. Specifically, we present SaCoFa and SaPa as two techniques to anonymize the result of trace-variant queries over an event log. Unlike existing techniques that achieve differential privacy through randomized noise insertion, our techniques rely on noise insertion mechanisms that incorporate a process’ semantics, thereby avoiding easily recognizable noise. The two techniques make different design choices, though. SaCoFa anonymizes a trace-variant distribution directly, thereby focusing on utility preservation at the expense of potentially changing the number of traces in the result considerably. SaPa, in turn, anonymizes a trace-variant distribution indirectly, through play-out of an anonymized directly-follows distribution. This way, the number of traces in the result stays close to that of the original log, but the drop in utility may become larger because only local control-flow information is used. Nevertheless, our experiments demonstrate that both approaches strike a better balance in preserving the utility of an event log than existing techniques.
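To make the two notions in the abstract concrete, the following minimal Python sketch computes a trace-variant distribution and a directly-follows distribution from a toy log, and generates traces by play-out over directly-follows counts. It is an illustration only, not the SaCoFa or SaPa algorithm: all function names, the `<start>`/`<end>` markers, and the plain Laplace perturbation are assumptions made for this example, and the perturbation shown is not by itself a complete privacy mechanism.

```python
import random
from collections import Counter

START, END = "<start>", "<end>"  # hypothetical artificial boundary markers


def trace_variants(log):
    """Count how often each distinct activity sequence (trace variant) occurs."""
    return Counter(tuple(trace) for trace in log)


def directly_follows(log):
    """Count directly-follows pairs, including the artificial start/end markers."""
    counts = Counter()
    for trace in log:
        path = [START, *trace, END]
        counts.update(zip(path, path[1:]))
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(counts, epsilon, rng):
    """Perturb each observed count with Laplace noise; keep positive results only.

    Assumption for illustration: sensitivity 1, so the noise scale is 1/epsilon.
    """
    noisy = {}
    for key, value in counts.items():
        perturbed = round(value + laplace_noise(1 / epsilon, rng))
        if perturbed > 0:
            noisy[key] = perturbed
    return noisy


def play_out(df_counts, n_traces, rng, max_len=50):
    """Generate traces by a weighted random walk over directly-follows counts."""
    successors = {}
    for (src, dst), count in df_counts.items():
        successors.setdefault(src, []).append((dst, count))
    traces = []
    for _ in range(n_traces):
        current, trace = START, []
        while len(trace) < max_len:
            options = successors.get(current)
            if not options:  # dead end, e.g. after noise removed an edge
                break
            activities, weights = zip(*options)
            current = rng.choices(activities, weights=weights)[0]
            if current == END:
                break
            trace.append(current)
        traces.append(tuple(trace))
    return traces
```

Because play-out generates a fixed number of traces, the size of the result can be held at the size of the original log, whereas each step of the walk only sees the preceding activity. This mirrors the trade-off the abstract attributes to using only local control-flow information.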
