Abstract

Background: Electronic health records (EHR) play an important role in the redefinition of phenotypes, given the wealth and heterogeneity of information now available from disparate data sources. A recent cross-sectional retrospective study described the potential of EHR for type 2 diabetes mellitus (T2D) screening when ad hoc models are used. About 10,000 US patients were analyzed through a variety of inference techniques applied to all records with a variable degree of completeness. The analyses conducted in the reference study indicated that EHR phenotypes significantly improved T2D detection.

Methods: Using these US patients and the T2D evidence from the above study, we propose an integrative inference approach that leverages the predictive power of EHR features selected by two well-known methods, Random Forests and Lasso. The goal is twofold: reducing the Big Data redundancies potentially harmful to the predictive learning task, and exploiting the interconnectivity of EHR features. A mutual information (MI) network is the inference tool used to identify communities useful to prioritize significant T2D features underlying the similarity between patients.

Results: With differing degrees of granularity, the communities detected after the application of both methods were centered especially on T2D comorbidities and risk factors. As such, they appear highly relevant for the assessment of two main issues: T2D disease burden and prevention.

Conclusions: Our analytical approach offers a solution for managing the EHR scale factor in a complex disease context. EHR are rich sources of phenotypic diversity through which novel stratifications of patients are expected. To enable these results, both pre-screening of variables and calibration of risk prediction methods become necessary steps in EHR analyses. We have presented networks identifying major T2D communities. The specific significance assigned to comorbidities and risk factors in relation to T2D can be inferred with accuracy from just a suitably reduced number of EHR features.
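
The idea of a mutual information network over EHR features can be illustrated with a minimal sketch. It is not the study's implementation: the feature names and toy patient values are hypothetical, the MI threshold is an arbitrary assumption, and plain connected components stand in for whatever community detection method the authors used. The feature selection step (Random Forests, Lasso) is assumed to have already produced the reduced feature set.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[a] * py[b]))
    return mi


def mi_network(features, threshold):
    """Link every pair of features whose mutual information exceeds the threshold."""
    graph = {name: set() for name in features}
    for u, v in combinations(features, 2):
        if mutual_information(features[u], features[v]) > threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Group nodes into connected components (a simple stand-in for communities)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical binary features for eight patients (illustration only).
features = {
    "hypertension": [1, 1, 1, 1, 0, 0, 0, 0],
    "high_bmi":     [1, 1, 1, 1, 0, 0, 0, 0],
    "smoker":       [1, 0, 1, 0, 1, 0, 1, 0],
    "low_activity": [1, 0, 1, 0, 1, 0, 1, 0],
}

graph = mi_network(features, threshold=0.5)  # threshold is an assumed value
communities = sorted(sorted(c) for c in connected_components(graph))
print(communities)
```

In this toy data the identical features share one bit of information and are linked, while the independent pairs share none, so two groups of two features emerge. On real EHR data the threshold and the community detection method would both need careful calibration.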

Highlights

  • Type 2 diabetes (T2D) is a chronic condition that affects how our body metabolizes sugar or glucose, inducing either resistance to the effects of insulin, or lack of its production in a way sufficient to maintain normal glucose levels

  • Diabetes is a disorder traditionally subdivided into two types

  • Our results indicate that the redundancy observed with LA is well reflected in the communities


Summary

Introduction

Type 2 diabetes (T2D) is a chronic condition that affects how our body metabolizes sugar or glucose, inducing either resistance to the effects of insulin, or lack of its production in a way sufficient to maintain normal glucose levels. Several mechanisms can lead to diabetes, and these can be modified by genetic, lifestyle, and environmental factors. All such factors make T2D a very heterogeneous disease, one for which many types of data should be analyzed to achieve superior precision of diagnoses and therapies. The analyses conducted in the reference study have indicated that EHR phenotypes significantly improved T2D detection.

