Abstract

Electronic health records (EHRs) from type 2 diabetes (T2D) patients consist of health markers measured longitudinally and sparsely at clinical encounters. Our goal is to use such data to learn latent patterns that can inform patients' health status related to T2D while accounting for the challenges of retrospectively collected EHRs. To handle challenges such as correlated longitudinal measurements, irregular and informative encounter times, and mixed marker types, we propose multivariate generalized linear models to learn latent patient subgroups. In our model, covariate effects are time-dependent, and latent Gaussian processes are introduced to model between-marker correlations over time. Using the inferred latent processes, we integrated the irregularly measured health markers of mixed types into composite scores and applied hierarchical clustering to learn latent subgroup structures among T2D patients. Application to an EHR dataset of T2D patients showed different trends in the effects of age, sex, and race on hypertension/high blood pressure, total cholesterol, glycated hemoglobin, high-density lipoprotein, and medications. The associations among these markers varied over time during the study window. Clustering results revealed four subgroups, each with a distinct health status. The same patterns were further confirmed using new EHR records from the same cohort. We developed a novel latent model to integrate longitudinal health markers in EHRs and characterize latent patient heterogeneity. The analysis indicated distinct subgroups of T2D patients, suggesting that effective healthcare management for these patients should be carried out separately for each subgroup.
