Abstract
Process mining techniques analyze event logs from information systems to derive useful patterns. In the big data era, however, real-life event logs are huge, unstructured, and complex, so traditional process mining techniques struggle to analyze them. To reduce this complexity, trace clustering groups similar traces together so that a simpler, more structured process model can be mined locally for each cluster. However, the high dimensionality of the feature space in which the traces are represented poses problems for trace clustering. In this paper, we study the effect of applying dimensionality reduction as a preprocessing step on the performance of trace clustering. In our experimental study we use three popular feature transformation techniques: singular value decomposition (SVD), random projection (RP), and principal component analysis (PCA), together with state-of-the-art trace clustering from process mining. Experimental results on a dataset constructed from a real event log of patient treatment processes in a Dutch hospital show that dimensionality reduction can improve trace clustering performance with respect to computation time and the average fitness of the mined local process models.
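To make the preprocessing step concrete, the following minimal sketch shows one of the three transformations, random projection, applied to trace feature vectors before clustering. It is an illustration under stated assumptions, not the setup used in the experiments: the bag-of-activities encoding, the toy activity names, and the helper names `activity_profile` and `random_projection` are all hypothetical, and the abstract does not specify how traces are encoded.

```python
import math
import random
from collections import Counter


def activity_profile(traces):
    """Encode each trace as a vector of activity counts (assumed encoding)."""
    vocab = sorted({activity for trace in traces for activity in trace})
    rows = []
    for trace in traces:
        counts = Counter(trace)
        rows.append([float(counts[activity]) for activity in vocab])
    return vocab, rows


def random_projection(rows, k, seed=0):
    """Project d-dimensional rows onto k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(rows[0])
    # Entries are drawn from N(0, 1/k), so squared distances are
    # preserved in expectation after projection.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, scale) for _ in range(k)] for _ in range(d)]
    return [
        [sum(row[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for row in rows
    ]


# Hypothetical toy log: each trace is a sequence of activity names.
traces = [
    ["register", "blood_test", "x_ray", "discharge"],
    ["register", "blood_test", "blood_test", "discharge"],
    ["register", "surgery", "icu", "x_ray", "discharge"],
]

vocab, X = activity_profile(traces)      # 3 traces, one column per activity
X_reduced = random_projection(X, k=2)    # 3 traces, 2 columns
```

The reduced vectors in `X_reduced` would then be handed to the trace clustering algorithm in place of the original high-dimensional profiles. SVD and PCA fit the same slot in the pipeline; they differ in that the projection matrix is computed from the data rather than drawn at random.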