Objective
To provide an open-source software package for determining temporal correlations between disease states using longitudinal electronic medical records (EMR).

Materials and Methods
We have developed an R-based package, Disease Correlation Network (DCN), which builds retrospective matched cohorts from longitudinal medical records to test for significant temporal correlations between diseases using two independent methodologies: Cox proportional hazards regression and random forest survival analysis. This optimizable package has the potential to control for relevant confounding factors such as age, gender, and other demographic and medical characteristics. Output is presented as a DCN, which users can analyze with a JavaScript-based interactive visualization tool to explore statistically significant correlations between disease states of interest using graph-theory-based network topology.

Results
We have applied this package to a longitudinal dataset at Loyola University Chicago Medical Center with 654 084 distinct initial diagnoses of 51 conditions in 175 539 patients. Over 90% of the disease correlations identified are supported by literature review. DCN is available for download at https://github.com/qunfengdong/DCN.

Conclusions
DCN allows screening of EMR data to identify potential relationships between chronic disease states. These data may then be used to formulate novel research hypotheses for further characterization of these relationships.
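To make the network output described in Materials and Methods more concrete, the following is a minimal sketch of how significant pairwise results could be assembled into a directed graph and summarized by node degree. It is not the DCN implementation, which is written in R. The names `PairResult`, `build_network`, and `degree_summary`, the 0.05 significance threshold, and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class PairResult(NamedTuple):
    """One pairwise survival test: is `outcome` diagnosed more often after `exposure`?"""
    exposure: str
    outcome: str
    hazard_ratio: float
    p_value: float


def build_network(results, alpha=0.05):
    """Keep significant pairs; return an adjacency map exposure -> {outcome: hazard_ratio}."""
    graph = defaultdict(dict)
    for r in results:
        if r.p_value < alpha:  # alpha is an assumed threshold, not taken from the paper
            graph[r.exposure][r.outcome] = r.hazard_ratio
    return dict(graph)


def degree_summary(graph):
    """Return {node: (out_degree, in_degree)} for every node that has an edge."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for src, targets in graph.items():
        for dst in targets:
            out_deg[src] += 1
            in_deg[dst] += 1
    nodes = set(out_deg) | set(in_deg)
    return {n: (out_deg[n], in_deg[n]) for n in sorted(nodes)}


# Placeholder values for illustration only; these are not results from the study.
toy_results = [
    PairResult("disease_A", "disease_B", 1.8, 0.001),
    PairResult("disease_A", "disease_C", 1.1, 0.40),
    PairResult("disease_B", "disease_C", 2.3, 0.01),
]
network = build_network(toy_results)
summary = degree_summary(network)
```

In this sketch an edge points from the earlier diagnosis to the later one, so a node with high out-degree is a condition that precedes many others. A real analysis would take the hazard ratios and p-values from the fitted survival models.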