Abstract
This paper addresses the problem of process discovery from large and complex event logs. Departing from the existing literature, we formulate the problem of optimal process discovery. A formal mathematical programming model is given, based on a novel hierarchical structuring of the event logs. Desired properties of event trace score functions are described, and properties of optimal process models are proved. A combination of Monte Carlo optimization and tabu search is proposed to cope with the very large size of the event logs and the combinatorial solution space. Numerical results show that our approach is suitable for large event logs and that it performs better than state-of-the-art approaches. We also demonstrate the applicability of our method on a real case study in health care. This paper illustrates the benefits of combining techniques from the operations research and process mining fields.
Note to Practitioners: Though directly applicable to general business process discovery, this paper is motivated by our collaboration with the company HEVA (Lyon, France) and health practitioners on patient care pathway discovery. The French hospitalization database, which contains the hospitalization history of all patients, is used for this purpose, and our goal is to determine the most meaningful process model of the patient hospitalization history. The hierarchical event structure of this paper provides a natural way of representing relationships between hospitalization events and is readily obtainable from the classical ICD-10 codes. The formal mathematical model and our optimization algorithm allow end users to strike the best balance between the faithfulness of the process model and its complexity. A case study of cardiovascular patients is presented to show the capability of the proposed approach to clearly capture the major patient pathways before and after the implantation of defibrillators. The results of this paper are highly valuable for doctors and public health decision makers, as crucial information is provided on patient care pathways for any selected cohort.
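The claim that a hierarchical event structure is readily obtainable from ICD-10 codes can be illustrated with a minimal Python sketch. It is not the paper's construction: the function names `icd10_ancestors` and `roll_up` are hypothetical, the sample codes are chosen for illustration only, and using the leading letter as the coarsest level is a simplifying assumption (official ICD-10 chapters do not map one-to-one onto letters).

```python
def icd10_ancestors(code: str) -> list[str]:
    """Return an ICD-10 code followed by its coarser ancestors, finest first.

    Levels used here (an illustrative assumption, not the paper's definition):
    full code (e.g. "I21.0"), three-character category ("I21"), leading letter ("I").
    """
    normalized = code.strip().upper().replace(".", "")
    if len(normalized) < 3:
        raise ValueError(f"not a valid ICD-10 code: {code!r}")
    levels = []
    if len(normalized) > 3:
        # Re-insert the dot between the category and the subcategory digits.
        levels.append(normalized[:3] + "." + normalized[3:])
    levels.append(normalized[:3])  # three-character category
    levels.append(normalized[0])   # leading letter, used as a coarse group
    return levels


def roll_up(trace: list[str], depth: int) -> list[str]:
    """Relabel every event of a trace at a given depth of the hierarchy.

    depth=1 keeps the leading letter, depth=2 the category, and depth>=3 the
    full code (or the finest level available for that code).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    rolled = []
    for code in trace:
        path = icd10_ancestors(code)[::-1]  # reorder from coarse to fine
        rolled.append(path[min(depth, len(path)) - 1])
    return rolled


if __name__ == "__main__":
    # A hypothetical hospitalization trace for one patient.
    trace = ["I21.0", "I50", "I21.4"]
    print(roll_up(trace, 2))
    print(roll_up(trace, 1))
```

Rolling a trace up to a coarser depth merges distinct events into one label, which gives a smaller process model at the cost of detail. This is one simple way to picture the trade-off between faithfulness and complexity mentioned above.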