MedXN: an open source medication extraction and normalization tool for clinical text.

Sunghwan Sohn,Hongfang Liu,Cheryl Clark,Christopher G Chute,Sean P Murphy,Scott R Halgrim

doi:10.1136/amiajnl-2013-002190

Abstract

We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expression. Then, each medication name and its attributes were combined together according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture. We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information are due to human assumption of missing attributes and medication names in the gold standard. The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions.

Full Text