Disease pathogenesis, a type of domain knowledge about the biological mechanisms leading to disease, has not been adequately encoded in machine-learning-based medical diagnostic models because of inter-patient variability and the complex dependencies among the underlying pathogenetic mechanisms. We propose (1) a novel pathogenesis probabilistic graphical model (PPGM) that quantifies the dynamics underpinning patient-specific data and pathogenetic domain knowledge, and (2) a Bayesian inference paradigm to answer medical queries and forecast acute onsets. The PPGM consists of two components: a Bayesian network of patient attributes and a temporal model of pathogenetic mechanisms. The model structure was constructed from expert knowledge elicitation, and its parameters were estimated using variational expectation-maximization algorithms. We benchmarked our model against two well-established hidden Markov models (HMMs), the input-output HMM (IO-HMM) and the switching autoregressive HMM (SAR-HMM), to evaluate computational cost, forecasting performance, and execution time. Two case studies, on obstructive sleep apnea (OSA) and paroxysmal atrial fibrillation (PAF), were used to validate the model. While the performance of the parameter-learning step was equivalent to that of the IO-HMM and SAR-HMM, our model's forecasting ability exceeded that of both. The merits of the PPGM are its capability to represent the dynamics of pathogenesis and support medical inference, and its interpretability for physicians. The model has been used to perform medical queries and forecast the acute onset of OSA and PAF. Additional applications include prognostic healthcare and preventive personalized treatment.