Abstract

Understanding the spectrum of risk factors affecting a patient’s specific disease can help personalize disease screening recommendations. Is it possible to use synthetic patient data to develop a novel graph methodology for visualizing and identifying potential risk factors for any given disease? This study simulated an electronic medical record (EMR) database with 150,000 synthetic patients generated by the Synthea tool and constructed a patient health factor graph in the Neo4j graph database, using lung cancer as an example target disease. The graph contained over 990,000 relations between factor nodes and patient nodes, and it was connected to a Unified Medical Language System (UMLS) lung cancer subgraph. This integrated biomedical graph made it possible to view and compare patient health factors and biomedical knowledge in the same graph. Through graph search, the connection delta ratio (CDR) was calculated for each factor node as a simple measure of factor–disease relationship strength. Ranking health factors by CDR produced a distribution of potential risk factors. The top-ranked factors were largely consistent with reports in the literature, demonstrating the validity of this graph method. Once validated on real patient data, the new graph method may have significant implications for identifying potential risk factors, for personalized disease screening recommendations, and for personalized medicine.
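The abstract does not give the formula for CDR, so the following is only a minimal sketch under a stated assumption: CDR is taken here to be the relative difference between the share of disease patients connected to a factor and the share of non-disease patients connected to it. That definition, the function names `connection_delta_ratio` and `rank_factors`, and the in-memory edge list standing in for the Neo4j graph are all hypothetical and may differ from the study's actual method.

```python
from collections import defaultdict


def connection_delta_ratio(edges, cases):
    """Score each factor from (patient_id, factor) edges.

    ASSUMED definition (not taken from the paper):
        CDR = (case_rate - control_rate) / control_rate
    where each rate is the fraction of patients in that group
    that have an edge to the factor.
    """
    edges = list(edges)
    cases = set(cases)
    patients = {patient for patient, _ in edges} | cases
    controls = patients - cases
    if not cases or not controls:
        raise ValueError("need at least one case and one control patient")

    # Map every factor node to the set of patients connected to it.
    linked = defaultdict(set)
    for patient, factor in edges:
        linked[factor].add(patient)

    scores = {}
    for factor, members in linked.items():
        case_rate = len(members & cases) / len(cases)
        control_rate = len(members & controls) / len(controls)
        if control_rate == 0:
            # Factor seen only among cases: the ratio is unbounded.
            scores[factor] = float("inf")
        else:
            scores[factor] = (case_rate - control_rate) / control_rate
    return scores


def rank_factors(scores):
    """Return factor names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: p1 and p2 are the (hypothetical) lung cancer patients.
edges = [
    ("p1", "smoking"), ("p2", "smoking"), ("p3", "smoking"),
    ("p1", "asthma"), ("p4", "asthma"),
]
scores = connection_delta_ratio(edges, cases={"p1", "p2"})
ranking = rank_factors(scores)
```

In this toy data every case but only half of the controls is connected to `smoking`, so it ranks above `asthma`, which is equally common in both groups. In the graph database the same counts would come from graph search over patient and factor nodes rather than from a Python list.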
