Abstract

Background: The volume and complexity of patient data, especially in personalised medicine, are steadily increasing in both clinical data and genomic profiles. Typically, more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests) are collected per patient in clinical trials. In oncology, hundreds of mutations can potentially be detected for each patient by genomic profiling. Data integration from multiple sources is therefore a key challenge for medical research and healthcare.

Methods: Semantic annotation of data elements can help identify matching data elements in different sources and thereby support data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements should carry the same annotations. However, large terminologies like SNOMED CT or UMLS do not provide uniform coding. The proposed approach is to develop semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations are re-used whenever a matching data element is available in the metadata repository.

Results: A web-based tool called ODMedit (https://odmeditor.uni-muenster.de/) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations, which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal.

Conclusion: Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available and is linked to a public metadata repository.

Highlights

  • The volume and complexity of patient data, especially in personalised medicine, are steadily increasing in both clinical data and genomic profiles. Typically, more than 1,000 items are collected per patient in clinical trials.

  • Semantic annotations enable ontology-based data integration. International terminologies like Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) [7] or a metathesaurus like the Unified Medical Language System (UMLS) [8] can help to specify the precise meaning of each data item. For instance, UMLS code C0475440 corresponds to tumour size, while C0005890 specifies body height.

  • The Operational Data Model (ODM) was selected as the technical representation for patient data items with semantic annotations.
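As a rough illustration of what a data item with a semantic annotation might look like in ODM, the sketch below parses a small hand-written `ItemDef` fragment and extracts its UMLS code. The fragment is hypothetical: the `OID`, item name, and the `Context` label `"UMLS CUI"` are assumptions made for this example and are not taken from the paper; only the code C0005890 (body height) comes from the text above.

```python
import xml.etree.ElementTree as ET

# Namespace of CDISC ODM version 1.3.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"

# Hypothetical ItemDef; the Alias element carries the semantic code.
# The OID, Name, and Context values are illustrative assumptions.
fragment = f"""
<ItemDef xmlns="{ODM_NS}" OID="I.HEIGHT" Name="Body height" DataType="float">
  <Alias Context="UMLS CUI" Name="C0005890"/>
</ItemDef>
""".strip()

item = ET.fromstring(fragment)

# Collect the code of every Alias child whose Context marks it as a UMLS code.
codes = [
    alias.get("Name")
    for alias in item.findall(f"{{{ODM_NS}}}Alias")
    if alias.get("Context", "").startswith("UMLS")
]

print(item.get("Name"), codes)
```

Keeping the code in a child element rather than in the item name means the local label can stay in any language while the annotation remains machine-comparable.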


Summary

Introduction

The volume and complexity of patient data, especially in personalised medicine, are steadily increasing in both clinical data and genomic profiles. Typically, more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests) are collected per patient in clinical trials. There is a large variety of electronic health record (EHR) systems [4], among other reasons because EHR data are typically collected in the local language of each country and because there are many specialised systems for certain disease domains. These heterogeneous systems, combined with the large number of data items per study, pose significant challenges for data integration. Semantic annotations (i.e., semantic codes associated with data elements, also called terminology bindings) enable ontology-based data integration. International terminologies like Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) [7] or a metathesaurus like the Unified Medical Language System (UMLS) [8] can help to specify the precise meaning of each data item. For instance, UMLS code C0475440 corresponds to tumour size, while C0005890 specifies body height. Assigning such codes consistently is challenging, given the huge number of medical terms and their homonyms and synonyms.
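The idea of matching data elements through their semantic codes rather than their labels can be sketched as follows. This is a minimal illustration, not the method implemented in ODMedit: the two sources, the item labels, and the helper names `index_by_codes` and `match_elements` are hypothetical, and only the two UMLS codes are taken from the text above. It also assumes that matching elements carry exactly the same set of codes, which is what uniform annotation is meant to guarantee.

```python
from collections import defaultdict

# Hypothetical data elements from two sources with labels in different
# languages. Only the UMLS codes come from the text; all else is illustrative.
source_a = [
    {"name": "Taille de la tumeur", "umls": ("C0475440",)},
    {"name": "Taille", "umls": ("C0005890",)},
]
source_b = [
    {"name": "Body height", "umls": ("C0005890",)},
    {"name": "Tumour size", "umls": ("C0475440",)},
    {"name": "Unannotated item", "umls": ()},
]


def index_by_codes(elements):
    """Map each set of codes to the names of the elements carrying it."""
    index = defaultdict(list)
    for element in elements:
        if element["umls"]:  # elements without annotation cannot be matched
            index[frozenset(element["umls"])].append(element["name"])
    return index


def match_elements(a, b):
    """Pair elements of a and b whose code sets are identical."""
    index_b = index_by_codes(b)
    pairs = []
    for element in a:
        key = frozenset(element["umls"])
        if not key:
            continue
        for other_name in index_b.get(key, []):
            pairs.append((element["name"], other_name))
    return pairs


matches = match_elements(source_a, source_b)
for left, right in matches:
    print(f"{left} <-> {right}")
```

The two French labels share the word "Taille" yet refer to different concepts, so label matching alone would be unreliable; the codes disambiguate them. The exact-set comparison also shows why non-uniform coding is a problem: if one source annotated body height with a different but synonymous code, the pair would not be found.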

