An approach for medical event detection in Chinese clinical notes of electronic health records

Xuesi Zhou,Sihan Zeng,Ji Wu,Haoqi Xiong,Xiangling Fu

doi:10.1186/s12911-019-0756-5

Abstract

Background: Medical event detection in narrative clinical notes of electronic health records (EHRs) is a task designed for reading text and extracting information. Most previous work on medical event detection treats the task as extracting concepts at word granularity, which omits the overall structural information of the clinical notes. In this work, we treat each clinical note as a sequence of short sentences and propose an end-to-end deep neural network framework.

Methods: We redefined the task as a sequence labelling task at short sentence granularity and proposed a corresponding novel tag system. The dataset was derived from a third-level grade-A hospital and consists of 2000 clinical notes annotated according to our proposed tag system. The proposed end-to-end deep neural network framework consists of a feature extractor and a sequence labeller, and we explored different implementations of each. We additionally proposed a smoothed Viterbi decoder as a sequence labeller that requires no additional parameter training, which can be a good alternative to a conditional random field (CRF) when computing resources are limited.

Results: Our sequence labelling models were compared to four baselines that treat the task as text classification of short sentences. Experimental results showed that our approach significantly outperforms the baselines. The best result was obtained by using the convolutional neural network (CNN) feature extractor and the sequential CRF sequence labeller, achieving an accuracy of 92.6%. Our proposed smoothed Viterbi decoder achieved a comparable accuracy of 90.07% with fewer trained parameters and gave more balanced performance across all categories, which indicates better generalization ability.

Conclusions: Evaluated on our annotated dataset, the comparison results demonstrated the effectiveness of our approach for medical event detection in Chinese clinical notes of EHRs. The best feature extractor is the CNN feature extractor, and the best sequence labeller is the sequential CRF decoder. We also verified empirically that our proposed smoothed Viterbi decoder can bring better generalization ability while achieving performance comparable to the sequential CRF decoder.
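The abstract does not spell out how the smoothed Viterbi decoder works, so the following is only a minimal sketch of the general idea of decoding a tag sequence over short sentences without training extra parameters. It assumes that "smoothed" means additive smoothing of tag transition counts taken from the annotated notes, and that per-sentence log scores come from some feature extractor. The function names, the `alpha` parameter, and the tag names used with it are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def estimate_transitions(tag_sequences, tags, alpha=1.0):
    """Log transition probabilities from annotated tag sequences.

    Assumption: additive (Laplace) smoothing with constant `alpha`, so that
    transitions never seen in the annotations keep a non-zero probability.
    """
    counts = Counter()
    for seq in tag_sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[(prev, cur)] += 1
    log_trans = {}
    for prev in tags:
        total = sum(counts[(prev, cur)] for cur in tags) + alpha * len(tags)
        for cur in tags:
            log_trans[(prev, cur)] = math.log((counts[(prev, cur)] + alpha) / total)
    return log_trans


def viterbi(log_emissions, log_trans, tags):
    """Return the highest-scoring tag sequence for one clinical note.

    log_emissions: one dict per short sentence, mapping tag -> log score.
    """
    if not log_emissions:
        return []
    best = [{t: log_emissions[0][t] for t in tags}]  # best score ending in each tag
    back = []                                        # back-pointers per position
    for scores in log_emissions[1:]:
        cur_best, cur_back = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + log_trans[(p, cur)])
            cur_best[cur] = best[-1][prev] + log_trans[(prev, cur)] + scores[cur]
            cur_back[cur] = prev
        best.append(cur_best)
        back.append(cur_back)
    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

In this sketch the transition table is obtained by counting alone, which is one way a decoder could avoid the gradient-based training that a CRF layer needs. The decoder can still override a weak per-sentence prediction when the neighbouring sentences make another tag more plausible.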

Highlights

Medical event detection in narrative clinical notes of electronic health records (EHRs) is a task designed for reading text and extracting information
The performance of the sequence labelling models is significantly higher than that of the text classification models, which indicates that the sequence labelling models can effectively exploit the context information to improve the accuracy, and verifies the correlation that exists between the short sentences
In this work, we have redefined the task of medical event detection in Chinese clinical notes of EHRs and proposed a novel tag system of short sentence granularity

Summary

Introduction

Medical event detection in narrative clinical notes of electronic health records (EHRs) is a task designed for reading text and extracting information. Most previous work on medical event detection treats the task as extracting concepts at word granularity, such as drug names, adverse drug events (ADEs), indications, and attributes of these concepts. This approach omits the overall structural information of the clinical note. Because the Chinese language is denser than English [7], a medical event in a Chinese clinical note usually covers several consecutive short sentences, which makes detection at word granularity even less suitable.
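To make the notion of short sentence granularity concrete, here is a minimal sketch of turning one clinical note into a sequence of short sentences. The text above does not state how the notes are segmented, so the delimiter set (common Chinese and ASCII clause punctuation) and the function name `split_short_sentences` are assumptions made for illustration only.

```python
import re

# Assumed delimiters: Chinese and ASCII commas, semicolons, and sentence-final
# marks. The ASCII full stop is left out so that decimals such as "38.5" survive.
_DELIMITERS = re.compile(r"[，。；！？,;!?]")


def split_short_sentences(note):
    """Split a clinical note into non-empty short sentences, in order."""
    return [part.strip() for part in _DELIMITERS.split(note) if part.strip()]
```

Each element of the returned list would then receive one tag, so a medical event spanning several consecutive short sentences appears as a run of related tags rather than as isolated words.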

Methods

Results

Discussion

Conclusion