Abstract

Background: Natural language processing (NLP) is a powerful tool for supporting the generation of Real-World Evidence (RWE). No existing NLP system enables extensive querying of parameters specific to multiple myeloma (MM) from unstructured medical reports. We therefore created an MM-specific ontology to accelerate information extraction (IE) from unstructured text. Methods: Our MM ontology consists of an extensive set of MM-specific, hierarchically structured attributes and values. We implemented “A Rule-based Information Extraction System” (ARIES) that uses this ontology. We evaluated ARIES on 200 randomly selected medical reports of patients diagnosed with MM. Results: Our system achieved a high F1-score of 0.92 on the evaluation dataset, with a precision of 0.87 and a recall of 0.98. Conclusions: Our rule-based IE system enables comprehensive querying of medical reports. The IE accelerates data extraction and enables clinicians to generate RWE on hematological issues faster. RWE helps clinicians make evidence-based decisions. Our tool accelerates the integration of research evidence into everyday clinical practice.
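As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` is our own and is not part of ARIES, and the inputs are the rounded values quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract.
reported_precision = 0.87
reported_recall = 0.98

recomputed_f1 = f1_score(reported_precision, reported_recall)
print(round(recomputed_f1, 2))  # prints 0.92
```

The recomputed value agrees with the reported F1-score of 0.92 to two decimal places.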

Highlights

  • Multiple Myeloma (MM) is the third most common hematological malignancy in Germany [1]

  • After optimizing the ontology on the training and validation set, an evaluation run was performed on the evaluation set

  • These reports are used by treating doctors of different departments or are sent to the general practitioners (GPs) of patients in order to update the GPs on the medical history and condition of shared patients


Summary

Introduction

Multiple Myeloma (MM) is the third most common hematological malignancy in Germany [1]. One source is routinely collected patient information. Extracting data from this source for Real-World Evidence (RWE) analysis is laborious and time-consuming. RWE is defined as “the technology-facilitated collation of all routinely collected information on patients from clinical systems to a comprehensive, homogeneously analyzable dataset (big data) that reflects the treatment reality in the best possible and comparable manner” [2]. No existing NLP system enables extensive querying of MM-specific parameters from unstructured medical reports. We created an MM-specific ontology to accelerate information extraction (IE) from unstructured text. The IE accelerates data extraction and enables clinicians to generate RWE on hematological issues faster. Our tool accelerates the integration of research evidence into everyday clinical practice.
