Identifying Clinical Terms in Medical Text Using Ontology-Guided Machine Learning.

Aryan Arbabi, Sanja Fidler, David R Adams, Michael Brudno

doi:10.2196/12596

Abstract

Background: Automatic recognition of medical concepts in unstructured text is an important component of many clinical and research applications, and its accuracy has a large impact on electronic health record analysis. The mining of medical concepts is complicated by the broad use of synonyms and nonstandard terms in medical documents.

Objective: We present a machine learning model for concept recognition in large unstructured text, which optimizes the use of ontological structures and can identify previously unobserved synonyms for concepts in the ontology.

Methods: We present a neural dictionary model that can be used to predict whether a phrase is synonymous with a concept in a reference ontology. Our model, called the Neural Concept Recognizer (NCR), uses a convolutional neural network to encode input phrases into an embedding space and then ranks medical concepts by their similarity in that space. It uses the hierarchical structure provided by the biomedical ontology as an implicit prior to learn better embeddings of the various terms. We trained our model on two biomedical ontologies: the Human Phenotype Ontology (HPO) and Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT).

Results: We tested our model trained on HPO by using two different data sets: 288 annotated PubMed abstracts and 39 clinical reports. We achieved 1.7%-3% higher F1-scores than those for our strongest manually engineered rule-based baselines (P=.003). We also tested our model trained on SNOMED-CT by using 2000 Intensive Care Unit discharge summaries from MIMIC (Multiparameter Intelligent Monitoring in Intensive Care) and achieved 0.9%-1.3% higher F1-scores than those of our baseline. The results of our experiments show the high accuracy of our model as well as the value of using the taxonomy structure of the ontology in concept recognition.

Conclusion: Most popular medical concept recognizers rely on rule-based models, which cannot generalize well to unseen synonyms. In addition, most machine learning methods require large corpora of annotated text that cover all classes of concepts, which can be extremely difficult to obtain for biomedical ontologies. Without relying on large-scale labeled training data or requiring any custom training, our model can be efficiently generalized to new synonyms and performs as well as or better than state-of-the-art methods custom built for specific ontologies.
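The ranking idea described in the Methods can be illustrated with a minimal sketch. This is not the NCR implementation: the abstract does not specify how the hierarchy enters the embeddings, so the sketch assumes one simple scheme in which each concept's embedding is the sum of a raw vector for the concept and the raw vectors of all its ancestors. The toy ontology, the vectors, and the function names are hypothetical, and the convolutional phrase encoder is replaced by a hand-written phrase vector.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology (child -> parents); not real HPO or SNOMED-CT content.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "heart_abnormality": ["root"],
    "arrhythmia": ["heart_abnormality"],
    "kidney_abnormality": ["root"],
}

# Hypothetical raw per-concept vectors; a trained model would learn these.
RAW: Dict[str, List[float]] = {
    "root": [0.1, 0.1, 0.1],
    "heart_abnormality": [1.0, 0.0, 0.0],
    "arrhythmia": [0.0, 1.0, 0.0],
    "kidney_abnormality": [0.0, 0.0, 1.0],
}


def ancestors(concept: str) -> Set[str]:
    """Return the concept itself plus every ancestor reachable via PARENTS."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS[node])
    return seen


def concept_embedding(concept: str) -> List[float]:
    """Assumed scheme: sum raw vectors over the concept and its ancestors,
    so concepts that share ancestors share embedding components."""
    total = [0.0] * len(RAW[concept])
    for node in ancestors(concept):
        for i, value in enumerate(RAW[node]):
            total[i] += value
    return total


def rank_concepts(phrase_vector: List[float]) -> List[Tuple[str, float]]:
    """Score every concept by dot product with the phrase vector, best first."""
    scores = []
    for concept in PARENTS:
        embedding = concept_embedding(concept)
        score = sum(p * e for p, e in zip(phrase_vector, embedding))
        scores.append((concept, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Stand-in for an encoder output for a phrase such as "irregular heartbeat".
ranking = rank_concepts([0.5, 1.0, 0.0])
for concept, score in ranking:
    print(f"{concept}: {score:.2f}")
```

Under this assumed scheme, a phrase vector that points toward a specific concept also scores that concept's ancestors above unrelated branches, which is one way a taxonomy can act as a prior on the embedding space.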

Journal: JMIR medical informatics	Publication Date: May 10, 2019
Citations: 46	License type: cc-by

Similar Papers

Developing an Exercise Games Ontology Towards Standard Rehabilitation Clinical Terms
Zul Hilmi Abdullah ... Lailatul Qadri Zakaria
01 Dec 2022

Automatic Stroke Medical Ontology Augmentation with Standard Medical Terminology and Unstructured Textual Medical Knowledge
Soonhyun Kwon ... Jaehak Yu
23 Aug 2021

Analysis of Causal Relationships in Integrated Ontologies of Diseases, Phenotypes, and Radiological Diagnosis.
Charles E Kahn Jr
Studies in health technology and informatics | VOL. 290
06 Jun 2022

Biomedical ontologies and their development, management, and applications in and beyond China
Hongjie Pan ... Yan Zhu
Journal of Bio-X Research | VOL. 2
01 Dec 2019
