A Framework (SOCRATex) for Hierarchical Annotation of Unstructured Electronic Health Records and Integration Into a Standardized Medical Database: Development and Usability Study.

Jimyung Park, Jae Youn Cheong, Jin Wook Choi, Seng Chan You, Mira Kang, Jin Roh, Rae Woong Park, Dong Yun Lee, Eugene Jeong, Dongsu Park, Chunhua Weng

doi:10.2196/23983

Abstract

Background: Although electronic health records (EHRs) have been widely used in secondary assessments, clinical documents are relatively less utilized owing to the lack of standardized clinical text frameworks across different institutions.

Objective: This study aimed to develop a framework for processing unstructured clinical documents of EHRs and integrating them with standardized structured data.

Methods: We developed a framework known as Staged Optimization of Curation, Regularization, and Annotation of clinical text (SOCRATex). SOCRATex has the following four aspects: (1) extracting clinical notes for the target population and preprocessing the data, (2) defining the annotation schema with a hierarchical structure, (3) performing document-level hierarchical annotation using the annotation schema, and (4) indexing annotations for a search engine system. To test the usability of the proposed framework, proof-of-concept studies were performed on EHRs. We defined three distinctive patient groups and extracted their clinical documents (ie, pathology reports, radiology reports, and admission notes). The documents were annotated and integrated into the Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) database. The annotations were used for creating Cox proportional hazard models with different settings of clinical analyses to measure (1) all-cause mortality, (2) thyroid cancer recurrence, and (3) 30-day hospital readmission.

Results: Overall, 1055 clinical documents of 953 patients were extracted and annotated using the defined annotation schemas. The generated annotations were indexed into an unstructured textual data repository. Using the annotations of pathology reports, we identified that node metastasis and lymphovascular tumor invasion were associated with all-cause mortality among colon and rectum cancer patients (both P=.02). The other analyses, which measured thyroid cancer recurrence using radiology reports and 30-day hospital readmission using admission notes in depressive disorder patients, also showed results consistent with previous findings.

Conclusions: We propose a framework for hierarchical annotation of textual data and integration into a standardized OMOP-CDM medical database. The proof-of-concept studies demonstrated that our framework can effectively process and integrate diverse clinical documents with standardized structured data for clinical research.
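As a rough illustration of steps (2) to (4) of the framework, the sketch below checks a document-level annotation against a hierarchical schema and flattens it into dotted-path fields of the kind a search index could store. It is a minimal sketch under stated assumptions, not the SOCRATex implementation: the schema, its field names, and the helpers `validate` and `flatten` are all hypothetical and do not come from the study.

```python
from typing import Any

# Hypothetical hierarchical schema for a pathology report. The field names
# are illustrative assumptions, not the schema used in the study.
SCHEMA = {
    "specimen": {"site": str},
    "tumor": {
        "node_metastasis": bool,
        "lymphovascular_invasion": bool,
    },
}


def validate(annotation: dict, schema: dict, path: str = "") -> list[str]:
    """Return the problems found when checking an annotation against a schema."""
    problems = []
    for key, expected in schema.items():
        here = f"{path}.{key}" if path else key
        if key not in annotation:
            problems.append(f"missing: {here}")
        elif isinstance(expected, dict):
            value = annotation[key]
            if isinstance(value, dict):
                # Recurse into the nested level of the hierarchy.
                problems.extend(validate(value, expected, here))
            else:
                problems.append(f"expected object: {here}")
        elif not isinstance(annotation[key], expected):
            problems.append(f"wrong type: {here}")
    return problems


def flatten(annotation: dict, path: str = "") -> dict[str, Any]:
    """Flatten a nested annotation into dotted-path fields for indexing."""
    flat = {}
    for key, value in annotation.items():
        here = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            flat.update(flatten(value, here))
        else:
            flat[here] = value
    return flat


# One hypothetical document-level annotation.
annotation = {
    "specimen": {"site": "colon"},
    "tumor": {"node_metastasis": True, "lymphovascular_invasion": False},
}

if not validate(annotation, SCHEMA):
    print(flatten(annotation))
```

Keeping the schema as a nested mapping means the same structure drives both the validation and the field names that end up in the index, so a variable such as node metastasis can later be queried by a stable path.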

Highlights

With the universal adoption of electronic health records (EHRs), the secondary use of EHRs becomes important for translational research and improvement of the quality of health care [1,2,3].
We propose a framework for hierarchical annotation of textual data and integration into a standardized Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) medical database.
In an international open science initiative, Observational Health Data Sciences and Informatics (OHDSI), the structured data of more than 200 hospitals worldwide were mapped into a standardized vocabulary and data structure referred to as the Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) [4].

Summary

Introduction

Background

With the universal adoption of electronic health records (EHRs), the secondary use of EHRs becomes important for translational research and improvement of the quality of health care [1,2,3]. Structured data have been widely utilized owing to their processable and standardized codes. In an international open science initiative, Observational Health Data Sciences and Informatics (OHDSI), the structured data of more than 200 hospitals worldwide were mapped into a standardized vocabulary and data structure referred to as the Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) [4]. Using the OMOP-CDM, OHDSI has generated medical evidence through large-scale observational research [5], supported by software and user interfaces that facilitate standardized phenotyping [6], statistical analysis [7], and machine-learning application [8]. Although EHRs have been widely used in secondary assessments, clinical documents are relatively less utilized owing to the lack of standardized clinical text frameworks across different institutions.

Journal: JMIR Medical Informatics	Publication Date: Mar 30, 2021
Citations: 13	License type: cc-by

Similar Papers

Annotating Cohort Data Elements with OHDSI Common Data Model to Promote Research Reproducibility
Yiqing Zhao ... Jennifer St Sauver
01 Dec 2018

Facilitating phenotype transfer using a common data model
George Hripcsak ...
Journal of Biomedical Informatics | VOL. 96
17 Jul 2019

Eos and OMOCL: Towards a seamless integration of openEHR records into the OMOP Common Data Model
Severin Kohler ... Roland Eils
Journal of Biomedical Informatics | VOL. 144
12 Jul 2023

An Evaluation of the THIN Database in the OMOP Common Data Model for Active Drug Safety Surveillance
Xiaofeng Zhou ... Qing Liu
Drug Safety | VOL. 36
04 Jan 2013
