Abstract

Objective: To optimally leverage the scalability and unique features of electronic health records (EHR) for research that would ultimately improve patient care, we need to accurately identify patients and extract clinically meaningful measures. Using multiple sclerosis (MS) as a proof of principle, we show how routinely collected EHR data can be used to identify patients with a complex neurological disorder and to derive an important surrogate measure of disease severity that has so far been available only in research settings.

Methods: In a cross-sectional observational study, 5,495 MS patients were identified from the EHR systems of two major referral hospitals using an algorithm that combines codified data with narrative information extracted by natural language processing. In the subset of patients who receive neurological care at an MS Center where disease measures have been collected, we used routinely collected EHR data to derive two clinically relevant aggregate indicators of MS severity: the multiple sclerosis severity score (MSSS) and brain parenchymal fraction (BPF, a measure of whole brain volume).

Results: The EHR algorithm that identifies MS patients has an area under the curve of 0.958, with 83% sensitivity, 92% positive predictive value, and 89% negative predictive value when a 95% specificity threshold is used. The correlation between EHR-derived and true MSSS has a mean R² = 0.38±0.05, and that between EHR-derived and true BPF has a mean R² = 0.22±0.08. To illustrate its clinical relevance, the derived MSSS captures the expected difference in disease severity between relapsing-remitting and progressive MS patients after adjusting for sex, age of symptom onset, and disease duration (p = 1.56×10⁻¹²).

Conclusion: Incorporating sophisticated codified and narrative EHR data accurately identifies MS patients and provides an estimate of a well-accepted indicator of MS severity that is widely used in research settings but is not part of routine medical records. Similar approaches could be applied to other complex neurological disorders.
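To make the operating point reported in the Results concrete, the following minimal Python sketch shows one way sensitivity, positive predictive value (PPV), and negative predictive value (NPV) can be computed once a score cutoff is fixed at a target specificity. It is illustrative only and is not the study's algorithm: the function name `metrics_at_specificity`, the toy scores, and the toy labels are assumptions introduced here, not data or code from the study.

```python
def metrics_at_specificity(scores, labels, target_specificity=0.95):
    """Find the lowest cutoff whose specificity meets the target, then
    report sensitivity, PPV, and NPV at that cutoff.

    scores: classifier outputs (higher means more likely a case).
    labels: 1 for a true case, 0 for a non-case (both must be present).
    """
    for cutoff in sorted(set(scores)):
        # A patient is predicted to be a case when the score exceeds the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)

        specificity = tn / (tn + fp)
        if specificity >= target_specificity:
            return {
                "cutoff": cutoff,
                "specificity": specificity,
                "sensitivity": tp / (tp + fn),
                # PPV is undefined when no patient is predicted positive.
                "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
                "npv": tn / (tn + fn),
            }
    raise ValueError("no cutoff reaches the target specificity")


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
result = metrics_at_specificity(toy_scores, toy_labels, target_specificity=0.75)
print(result)
```

Because specificity can only rise as the cutoff rises, taking the first cutoff that meets the target keeps sensitivity as high as possible at that specificity. PPV and NPV also depend on the prevalence of cases in the evaluated sample, so values computed this way apply to that sample's case mix.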


Introduction

With the increasing integration of electronic health records (EHR) into routine clinical care, there is an emerging interest in harnessing the wealth of EHR data for clinical research that improves patient care. Making optimal use of EHR data for such research requires efficient extraction of meaningful information from codified data (e.g., demographics, billing codes for diagnoses and procedures, laboratory results, electronic prescriptions) and narrative data (e.g., clinical encounter notes, imaging reports) to accurately identify patient cohorts and measure clinically relevant outcomes [1,2]. The growing availability and functionality of EHR systems, together with advances in the natural language processing (NLP) and bioinformatics methods that are essential for extracting meaningful clinical information from EHR data, have converged to enable efficient and cost-effective development of EHR-derived patient cohorts and large-scale assessment of phenotypes relevant to patient care [1,2,4]. Important work led by the Electronic Medical Records and Genomics (eMERGE) Network has further demonstrated the broad potential of EHR-based approaches in discovery and clinical research [15,16,17].

