Abstract

Resident admit notes (RANs) in electronic medical records (EMRs) are first-hand information for studying a patient’s condition. Extracting medical entities from RANs is an important step in obtaining disease information for medical decision-making. In Chinese electronic medical records, each medical entity carries not only word-level information but also rich character-level information, so combining words and characters effectively is important for medical entity extraction. We propose a medical entity recognition model for Chinese RANs based on a character and word attention-enhanced (CWAE) neural network. In our model, word embeddings and character-based embeddings are obtained through a character-enhanced word embedding (CWE) model and a convolutional neural network (CNN), respectively. An attention mechanism then combines the character-based embeddings with the word embeddings, which significantly improves the expressive power of the word representations. The new word embeddings produced by the attention mechanism are fed into a bidirectional long short-term memory (BI-LSTM) network and a conditional random field (CRF) to extract entities. We extracted nine types of key medical entities from Chinese RANs and evaluated our model. The proposed method was compared with two traditional machine learning methods, CRF and support vector machine (SVM), and with related deep learning models. The results show that our model performs better, reaching an F1-score of 94.44%.

Highlights

  • An electronic medical record (EMR) is a textual record of medical activities [1,2]

  • For medical entity recognition in Chinese resident admit notes (RANs), we propose a model based on a character and word attention-enhanced (CWAE) neural network



Summary

Introduction

An electronic medical record (EMR) is a textual record of medical activities [1,2]. The development of information technology has promoted the growth of electronic medical records. Chinese RANs are unstructured, so accurately extracting their entity information is highly significant. For Chinese entity recognition, words such as “腹壁 (abdominal wall)” and “静脉 (vein)” and their characters “腹”, “壁”, “静”, “脉” should be considered at the same time. For medical entity recognition in Chinese RANs, we propose a model based on a character and word attention-enhanced (CWAE) neural network. We annotated nine types of medical entities on 355 RANs from a well-known hospital in Hunan Province, China: 医学发现 (medical discovery), 时间词 (temporal word), 检查 (inspection), 检验 (laboratory test), 治疗 (treatment), disease, drug, body part, and measurement.
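
To make the idea of combining word and character information concrete, the following is a minimal sketch of one common way to blend the two embeddings with a learned gate. It is an illustration under stated assumptions, not the paper's implementation: the gate formula, the mean-pooling stand-in for the CNN, the embedding dimension, and all names (`attention_gate`, `mean_pool`, `word_emb`, `char_emb`) are hypothetical, and the vectors are random rather than trained.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_gate(word_vec, char_vec, weights, bias):
    """Blend a word embedding with a character-based embedding.

    A per-dimension gate z = sigmoid(W [word; char] + b) is computed and
    the output is z * word + (1 - z) * char. This gate form is an
    assumption for illustration; the summary does not give the formula.
    """
    concat = word_vec + char_vec  # list concatenation, length 2 * dim
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]
    return [z * w + (1.0 - z) * c for z, w, c in zip(gate, word_vec, char_vec)]


def mean_pool(char_vecs):
    """Stand-in for the CNN: average the character vectors of one word."""
    n = len(char_vecs)
    return [sum(col) / n for col in zip(*char_vecs)]


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


random.seed(0)
dim = 4  # toy embedding size (assumption)

# Toy, randomly initialised embeddings for the words and characters
# used as examples in the introduction.
word_emb = {"腹壁": rand_vec(dim), "静脉": rand_vec(dim)}
char_emb = {ch: rand_vec(dim) for ch in "腹壁静脉"}

# Gate parameters; in a real model these would be learned.
W = [rand_vec(2 * dim) for _ in range(dim)]
b = [0.0] * dim

combined = {
    word: attention_gate(vec, mean_pool([char_emb[ch] for ch in word]), W, b)
    for word, vec in word_emb.items()
}
```

Because the gate lies between 0 and 1, each dimension of a combined vector is a weighted average of the word-level and character-level values. In a pipeline like the one described, vectors of this kind would be the input to the BI-LSTM and CRF layers.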

Related Work
Entity Annotation
Annotation
Consistency Check
The entity consistency statistics are shown in Figure
Methods
Result
Figure 5 is the
Word Embedding
Character-Based
BI-LSTM Layer
CRF Layer
Training Detail
Evaluation Metrics
Result of Our Models
Comparison with Traditional Models
Comparison of Overall
Comparison of Each Category
Comparison with Deep Learning Models
Comparison
Future Work
