Abstract

Decision support systems are being developed to assist clinicians in complex decision-making processes by leveraging information from clinical knowledge and electronic health records (EHRs). One typical application is disease risk prediction, which can be challenging due to the complexity of modelling longitudinal EHR data, including unstructured medical notes. To address this challenge, we propose a deep state-space model (DSSM) that simulates the patient’s state transition process and formally integrates latent states with risk observations. A typical DSSM consists of three parts: a prior module that generates the distribution of the current latent state based on previous states; a posterior module that approximates the latent states using up-to-date medical notes; and a likelihood module that predicts disease risks using latent states. To encode raw medical notes efficiently, our posterior module uses an attentive encoder that extracts information from these unstructured, high-dimensional texts. Additionally, we couple a predictive clustering algorithm with our DSSM to learn clinically useful representations of patients’ latent states. The latent states are clustered into multiple groups, and the weighted average of the cluster centres is used for prediction. We demonstrate the effectiveness of our deep clustering-based state-space model using two real-world EHR datasets, showing that it not only produces better risk prediction results than the baseline methods but also clusters similar patient health states into groups.
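The cluster-weighted prediction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weights are computed, so everything here is an assumption made for illustration: the weights are taken as a softmax over negative squared distances to the cluster centres, the likelihood module is reduced to a single logistic layer, and the names `soft_assign`, `predict_risk`, `w_out` and `b_out` are hypothetical rather than taken from the paper.

```python
import math


def soft_assign(z, centres, temperature=1.0):
    """Soft cluster weights for latent state z.

    Assumption: softmax over negative squared Euclidean distances.
    The paper may use a different weighting scheme.
    """
    logits = [
        -sum((c_j - z_j) ** 2 for c_j, z_j in zip(c, z)) / temperature
        for c in centres
    ]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_risk(z, centres, w_out, b_out):
    """Predict a risk score from the weighted average of cluster centres.

    Assumption: a single logistic layer stands in for the likelihood module.
    """
    weights = soft_assign(z, centres)
    dim = len(z)
    # Weighted average of the cluster centres replaces the raw latent state.
    z_bar = [sum(w * c[j] for w, c in zip(weights, centres)) for j in range(dim)]
    logit = sum(a * b for a, b in zip(w_out, z_bar)) + b_out
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Toy example: three hypothetical cluster centres in a 2-D latent space.
centres = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
z = [0.9, 1.1]  # latent state lying close to the second centre
w_out = [0.5, -0.25]
risk, weights = predict_risk(z, centres, w_out, b_out=0.1)
```

In this toy setting the latent state lies nearest the second centre, so that centre receives the largest weight and dominates the averaged representation passed to the risk predictor.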