Medication Extraction from Electronic Clinical Notes in an Integrated Health System: A Study on Aspirin Use in Patients with Nonvalvular Atrial Fibrillation

Chengyi Zheng, Nazia Rashid, River Koblick, Jaejin An

doi:10.1016/j.clinthera.2015.07.002

Abstract

Purpose: The purpose of this study was to investigate whether aspirin use can be captured from the clinical notes in a nonvalvular atrial fibrillation population.

Methods: A total of 29,507 patients with newly diagnosed nonvalvular atrial fibrillation were identified from January 1, 2006, through December 31, 2011, and were followed up through December 31, 2012. More than 3 million clinical notes were retrieved from electronic medical records. A training data set of 2949 notes was created to develop a computer-based method to automatically extract aspirin use status and dosage information using natural language processing (NLP). A gold standard data set of 5339 notes was created using a blinded manual review. NLP results were validated against the gold standard data set. The aspirin data from the structured medication databases were also compared with the results from NLP. Positive and negative predictive values, along with sensitivity and specificity, were calculated.

Findings: NLP achieved 95.5% sensitivity and 98.9% specificity when compared with the gold standard data set. The positive predictive value was 93.0%, and the negative predictive value was 99.3%. NLP identified aspirin use for 83.8% of the study population, and 70% of the low-dose aspirin use was identified only by the NLP method.

Implications: We developed and validated an NLP method specifically designed to identify low-dose aspirin use status from the clinical notes with high accuracy. This method can be a valuable tool to supplement existing structured medication data.
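The four validation measures named in the abstract follow directly from a two-by-two comparison of NLP output against the gold standard labels. The sketch below is a minimal illustration of that arithmetic, not the authors' code: the names `ConfusionCounts`, `count_outcomes`, and `validation_metrics` are hypothetical, and the labels in the usage example are invented for illustration, since the abstract does not report the underlying cell counts.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a 2x2 table (hypothetical helper, not from the paper)."""
    tp: int  # NLP says aspirin use, gold standard agrees
    fp: int  # NLP says aspirin use, gold standard says no use
    fn: int  # NLP says no use, gold standard says aspirin use
    tn: int  # NLP says no use, gold standard agrees


def count_outcomes(predicted: Sequence[bool], gold: Sequence[bool]) -> ConfusionCounts:
    """Tally note-level agreement between NLP output and manual review."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(g and not p for p, g in zip(predicted, gold))
    tn = sum(not p and not g for p, g in zip(predicted, gold))
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


def validation_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from the four cell counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true users detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-users correctly cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of positive calls that are right
        "npv": c.tn / (c.tn + c.fn),          # share of negative calls that are right
    }


if __name__ == "__main__":
    # Invented labels for five notes; True means "aspirin use documented".
    nlp_labels = [True, True, False, False, True]
    gold_labels = [True, False, False, True, True]
    counts = count_outcomes(nlp_labels, gold_labels)
    for name, value in validation_metrics(counts).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the method itself, whereas the predictive values also depend on how common aspirin use is in the validation notes, so the latter would shift if the same method were applied to a population with a different prevalence.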

Full Text