Abstract

Background: The electronic medical record (EMR) has become a modern compendium of health information, from broad clinical assessments down to an individual’s heart rate. The wealth of information in these EMRs holds promise for clinical discovery and hypothesis generation. Unfortunately, as these systems have become more robust, mining them for relevant clinical information has been hindered by the overall data architecture and often requires the expertise of a clinical informatician. However, because the information presented to the clinician through the digital workspace is derived from the core EMR database, its format is well structured and can be mined using text recognition and parsing scripts.

Methods: Here we present a program that parses output from Epic Hyperspace® and generates a relational database of clinical information. For ease of use, our protocol capitalizes on the familiarity of Microsoft Excel® as an intermediary for storing the raw output from the EMR, with data parsing and processing scripts written in SAS V9.4 (Cary, North Carolina).

Results: As a proof of concept, we extracted the diagnosis codes and standard laboratory results for 190 patients seen in our Congenital Cytomegalovirus Clinic at Texas Children’s Hospital in Houston, Texas. Manual extraction of these data into Microsoft Excel® took 1 hour, and the scripts to parse the data took less than 5 seconds to run. Data from these patients included 3,800 ICD-10 codes (along with their metadata) and 33,000 individual laboratory values. In total, more than 850,000 characters were extracted from the EMR using this technique. Manual review of 10 randomly selected charts found the data in perfect concordance with the EMR, a direct reflection of the fidelity of the parsing scripts. On average, an experienced user was able to enter three ICD-10 codes and six individual laboratory values per minute. At these rates, the same process would have taken at least 110 hours using a conventional chart review technique.

Conclusion: High-throughput data mining tools have the potential to improve the feasibility of studies dependent upon information stored in the EMR. When coupled with specific content knowledge, this approach can consolidate months of data collection into a day’s task.

Disclosures: All authors: No reported disclosures.
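
To make the parsing step concrete, the following is a minimal sketch of turning pasted EMR text into relational tables. It is written in Python with SQLite rather than the SAS V9.4 used in this work, and it is not the authors' script. The tab-delimited layout, the column names, the sample values, and the helpers `parse_labs` and `load_labs` are all hypothetical, chosen only for illustration; the actual layout of Epic Hyperspace® output is not described here.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: one tab-delimited row per laboratory result, as it
# might look after pasting EMR output into a spreadsheet and saving it as text.
# The layout and the values are made up for illustration only.
RAW_EXPORT = (
    "patient_id\tcomponent\tvalue\tunit\tcollected\n"
    "P001\tHemoglobin\t11.2\tg/dL\t2017-03-01\n"
    "P001\tPlatelets\t250\t10^3/uL\t2017-03-01\n"
    "P002\tHemoglobin\t12.8\tg/dL\t2017-04-15\n"
)


def parse_labs(raw_text):
    """Parse tab-delimited rows into (patient_id, component, value, unit, collected) tuples."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter="\t")
    return [
        (rec["patient_id"], rec["component"], float(rec["value"]),
         rec["unit"], rec["collected"])
        for rec in reader
    ]


def load_labs(conn, rows):
    """Store parsed rows in two related tables: one row per patient, one per lab value."""
    conn.execute("CREATE TABLE IF NOT EXISTS patient (patient_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lab ("
        "patient_id TEXT REFERENCES patient(patient_id), "
        "component TEXT, value REAL, unit TEXT, collected TEXT)"
    )
    # INSERT OR IGNORE keeps a single patient row even when a patient has many labs.
    conn.executemany("INSERT OR IGNORE INTO patient VALUES (?)", [(r[0],) for r in rows])
    conn.executemany("INSERT INTO lab VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_labs(conn, parse_labs(RAW_EXPORT))
```

Once loaded, a query such as `SELECT component, value FROM lab WHERE patient_id = 'P001'` returns that patient's results; a real export would likely need additional cleaning (non-numeric results, flags, reference ranges) that this sketch omits.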
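
The 110-hour estimate for conventional chart review can be reproduced from the reported counts and entry rates. The short Python sketch below assumes that codes and laboratory values are entered sequentially at the stated average rates, with no breaks or error correction; the function name `manual_entry_hours` is illustrative.

```python
# Counts and entry rates as reported in the Results.
ICD10_CODES = 3800        # ICD-10 codes extracted
LAB_VALUES = 33000        # individual laboratory values extracted
CODES_PER_MINUTE = 3      # experienced user, average
LABS_PER_MINUTE = 6       # experienced user, average


def manual_entry_hours(n_codes, n_labs, codes_per_min, labs_per_min):
    """Estimate hours of manual entry, assuming sequential entry at constant rates."""
    minutes = n_codes / codes_per_min + n_labs / labs_per_min
    return minutes / 60


hours = manual_entry_hours(ICD10_CODES, LAB_VALUES, CODES_PER_MINUTE, LABS_PER_MINUTE)
print(f"Estimated manual entry time: {hours:.1f} hours")  # prints 112.8 hours
```

The codes account for roughly 21 hours and the laboratory values for roughly 92 hours, which together are consistent with the figure of at least 110 hours.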
