Abstract
Background: Electronic medical records (EMRs) contain a variety of valuable medical information about patients. If risk factors for disease can be recognized and extracted from the EMRs of patients with cardiovascular disease (CVD) and used to predict CVD, clinical texts can be processed automatically, which improves the accuracy of the support given to doctors in the clinical diagnosis of CVD. As CVD becomes more prevalent worldwide, many researchers have studied EMR-based CVD prediction as a way to improve diagnostic efficiency.
Methods: This paper proposes an Enhanced Character-level Deep Convolutional Neural Networks (EnDCNN) model for cardiovascular disease prediction.
Results: On a manually annotated corpus of Chinese EMRs, our risk factor identification and extraction model achieved an F-score of 0.9073 and our prediction model achieved an F-score of 0.9516, a better prediction result than most previous methods.
Conclusions: The character-level model based on text region embedding maps risk factors and their labels as a unit into a vector effectively, and downsampling plays a crucial role in improving the training efficiency of the deep CNN. Moreover, the shortcut connections with pre-activation used in our model architecture make training free of dimension matching.
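The two architectural ideas named in the conclusions, shortcut connections with pre-activation and downsampling, can be illustrated with a minimal NumPy sketch. It is not the authors' EnDCNN implementation: the function names, the kernel size of 3, the pooling window of 3 with stride 2, and the tensor shapes are all assumptions chosen for illustration. The point it shows is that when every layer keeps the same number of feature maps, the shortcut is a plain addition and needs no dimension-matching projection, and that each downsampling step halves the sequence length the later layers must process.

```python
import numpy as np

def conv1d_same(x, w):
    """1-D convolution with zero padding so the output length equals the input length.
    x: (length, channels); w: (kernel, channels, channels). Shapes are assumptions."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        np.einsum("kc,kcd->d", xp[i:i + k], w) for i in range(x.shape[0])
    ])

def pre_activation_block(x, w1, w2):
    """Two convolutions, each preceded by ReLU (pre-activation), plus an identity shortcut.
    The channel count is fixed, so x and h have the same shape and can be added directly."""
    h = conv1d_same(np.maximum(x, 0.0), w1)
    h = conv1d_same(np.maximum(h, 0.0), w2)
    return x + h

def downsample(x, size=3, stride=2):
    """Max pooling along the sequence axis; with stride 2 the length is roughly halved."""
    return np.stack([
        x[i:i + size].max(axis=0) for i in range(0, x.shape[0], stride)
    ])

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))            # 8 positions, 4 feature maps (hypothetical sizes)
w1 = rng.normal(size=(3, 4, 4)) * 0.1  # random stand-ins for learned weights
w2 = rng.normal(size=(3, 4, 4)) * 0.1

y = pre_activation_block(x, w1, w2)    # same shape as x
z = downsample(y)                      # sequence length 8 -> 4
print(y.shape, z.shape)
```

Because the pooled output keeps the same number of feature maps, another block of the same kind can be stacked on it without changing any weight shapes.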
Highlights
Electronic medical records contain a variety of valuable medical information for patients
Experimental results: we ran extensive comparative experiments on variants of our own model and on other models. In Fig. 5, we compared the Conditional Random Field (CRF) and Bi-directional Long Short-term Memory Networks (BiLSTM)-CRF models for the identification of risk factors in EMRs
We used the CRF and BiLSTM-CRF algorithms to identify the risk of cardiovascular disease (CVD) and its corresponding risk factors
Summary
Electronic medical records (EMRs) contain a variety of valuable medical information about patients. A complete medical record contains only 10 records of information that are effective in causing disease. This excess of irrelevant information reduces the CNN's emphasis on the effective disease information and greatly lengthens the training time of the neural network. We therefore propose to extract the risk factors that cause disease from EMRs, together with the Time Attribute of these risk factors. Extracting the risk factors and recognizing their corresponding labels suits the characteristics of the CNN we use. This method avoids a large amount of non-critical information and reduces, to some extent, the time spent training the model architecture.