Process mining holds promise for analysing longitudinal data in clinical epidemiology, yet its application remains limited. This study proposes and evaluates a methodology for applying process mining techniques in observational clinical epidemiology. The methodology integrates a cohort study design with data-driven process mining in eight steps: data collection, data extraction and curation, event-log generation, process discovery, process abstraction, hypothesis generation, statistical testing, and prediction. Together, these steps facilitate the discovery of disease progression patterns. We applied the methodology in a cohort study comparing new users of proton pump inhibitors (PPI) and H2 blockers (H2B). Compared with H2B use, PPI use was associated with a higher risk of disease progression, including a greater than 30% decline in estimated glomerular filtration rate (eGFR) (hazard ratio [HR] 1.6, 95% confidence interval [CI] 1.4-1.8), and with increased all-cause mortality (HR 3.0, 95% CI 2.1-4.4). We also investigated the associations between each transition and covariates such as age, gender, and comorbidities, which offers deeper insight into disease progression dynamics, and developed a risk prediction tool that estimates an individual's probability of transition at a future time. The proposed methodology bridges the gap between process mining and epidemiological studies and provides a useful approach to investigating disease progression and risk factors. The synergy between these fields enhances the depth of study findings and fosters the discovery of new insights and ideas.
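As an illustration of the event-log generation and process discovery steps, the following is a minimal sketch and not the authors' implementation. It assumes hypothetical longitudinal eGFR records and death dates, derives a "greater than 30% decline from baseline" event per patient, and counts directly-follows transitions between states. The state names, the `build_event_log` and `directly_follows` helpers, and all data values are assumptions made for this example.

```python
from collections import Counter

# Hypothetical longitudinal records: (patient_id, day, eGFR). Illustrative values only.
measurements = [
    ("p1", 0, 90.0), ("p1", 180, 80.0), ("p1", 360, 60.0),
    ("p2", 0, 75.0), ("p2", 200, 74.0),
]
# Hypothetical death records: patient_id -> day of death.
deaths = {"p1": 500}


def build_event_log(measurements, deaths, decline_threshold=0.30):
    """Turn raw measurements into one ordered trace of states per patient."""
    baseline = {}
    events = {}
    for pid, day, egfr in sorted(measurements, key=lambda r: (r[0], r[1])):
        if pid not in baseline:
            # The first measurement is treated as the baseline (an assumption).
            baseline[pid] = egfr
            events[pid] = [(day, "cohort_entry")]
            continue
        declined = egfr < (1 - decline_threshold) * baseline[pid]
        seen = any(state == "egfr_decline_gt30" for _, state in events[pid])
        if declined and not seen:
            # Record only the first time the decline threshold is crossed.
            events[pid].append((day, "egfr_decline_gt30"))
    for pid, day in deaths.items():
        if pid in events:
            events[pid].append((day, "death"))
    # Order each patient's events by day and keep only the state labels.
    return {
        pid: [state for _, state in sorted(ev, key=lambda e: e[0])]
        for pid, ev in events.items()
    }


def directly_follows(event_log):
    """Count how often one state is immediately followed by another."""
    counts = Counter()
    for trace in event_log.values():
        counts.update(zip(trace, trace[1:]))
    return counts


event_log = build_event_log(measurements, deaths)
transitions = directly_follows(event_log)
for (src, dst), n in sorted(transitions.items()):
    print(f"{src} -> {dst}: {n}")
```

Each transition counted this way is a candidate hypothesis for the later statistical testing step, for example a time-to-event model comparing exposure groups on a given transition.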