Automatic extraction of medication information from medical discharge summaries

Hui Yang

doi:10.1136/jamia.2010.003863

Abstract

This article describes a system developed for the 2009 i2b2 Medication Extraction Challenge, whose purpose was to extract medication information from hospital discharge summaries. The system explored several linguistic natural language processing techniques (eg, term-based and token-based rule matching) to identify medication-related information in the narrative text. A number of lexical resources were constructed to profile lexical or morphological features for different categories of medication constituents. Performance was evaluated in terms of the micro-averaged F-measure at the horizontal system level. The automated system performed well, achieving an F-micro of 80% for the term-level results and 81% for the token-level results, which placed it sixth in exact matches and fourth in inexact matches in the i2b2 competition. The overall results show that this relatively simple rule-based approach is capable of tackling multiple entity identification tasks such as medication extraction in situations where few annotated training documents are available for machine learning approaches and the entity information can be characterized with a set of feature tokens.
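For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-averaged F-measure is commonly computed: true positives, false positives, and false negatives are pooled across all documents before precision and recall are calculated. The `Counts` class, the `micro_f1` function, and the example counts are hypothetical illustrations, not part of the described system or of the official i2b2 evaluation script, whose exact matching rules are not given here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Counts:
    """Hypothetical per-document match counts (illustrative only)."""
    tp: int  # system entries that match a gold-standard entry
    fp: int  # system entries with no gold-standard match
    fn: int  # gold-standard entries the system missed


def micro_f1(per_document: Iterable[Counts]) -> float:
    """Pool counts over all documents, then compute a single F-measure."""
    docs = list(per_document)
    tp = sum(c.tp for c in docs)
    fp = sum(c.fp for c in docs)
    fn = sum(c.fn for c in docs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts for two documents, for illustration only.
docs = [Counts(tp=8, fp=2, fn=2), Counts(tp=1, fp=1, fn=3)]
score = micro_f1(docs)
```

Because counts are pooled before averaging, documents with many medication entries influence the score more than short ones, which is the main difference from a macro-averaged score computed per document.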

Full Text