Abstract
Background
Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks. The advent of electronic medical records (EMRs) has not only changed the format of medical records but also helped users to obtain information faster. However, working directly with Chinese EMRs poses many challenges, such as low quality, huge quantity, class imbalance, and semi-structured or unstructured content, as well as the high density of the Chinese language compared with English. Therefore, effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs.
Results
In this paper, we propose a deep learning framework to study intelligent diagnosis using Chinese EMR data, which incorporates a convolutional neural network (CNN) into an EMR classification application. The novelty of this paper is reflected in the following: (1) We construct a pediatric medical dictionary based on Chinese EMRs. (2) Word2vec word embeddings are used to represent the semantics of the content of Chinese EMRs. (3) A fine-tuned CNN model is constructed for pediatric diagnosis from Chinese EMR data. Our results on real-world pediatric Chinese EMRs demonstrate that the average accuracy and F1-score of the CNN models are up to 81%, which indicates the effectiveness of the CNN model for the classification of EMRs. In particular, a fine-tuned one-layer CNN performs best among all CNN, recurrent neural network (RNN) (long short-term memory, gated recurrent unit) and CNN-RNN models, with average accuracy and F1-score both up to 83%.
Conclusion
The CNN framework, which includes word segmentation, word embedding and model training, can serve as an intelligent auxiliary diagnosis tool for pediatricians. In particular, a fine-tuned one-layer CNN performs well, which indicates that word order does not appear to have a useful effect on our Chinese EMRs.
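To illustrate why a domain dictionary matters for word segmentation, the following is a minimal sketch of dictionary-based segmentation using forward maximum matching. The abstract does not say which segmentation algorithm or tool was used, so the algorithm, the function name `forward_max_match` and the tiny vocabularies here are all assumptions chosen for illustration.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy dictionary segmentation (forward maximum matching).

    Illustrative assumption: not necessarily the segmenter used in the paper.
    At each position, take the longest dictionary word that matches;
    fall back to a single character when nothing matches.
    """
    if max_len is None:
        max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabularies: a general one, and one extended with a
# pediatric term ("支气管肺炎", bronchopneumonia).
general_vocab = {"咳嗽", "发热", "支气管", "肺炎"}
pediatric_vocab = general_vocab | {"支气管肺炎"}

text = "咳嗽发热支气管肺炎"
general_tokens = forward_max_match(text, general_vocab)
pediatric_tokens = forward_max_match(text, pediatric_vocab)
```

With the general vocabulary the disease name is split into two tokens ("支气管", "肺炎"); with the extended vocabulary it is kept as the single token "支气管肺炎", which is the kind of effect a domain dictionary is meant to have.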
Highlights
Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks
We study the effectiveness of our proposed framework on real-world pediatric Chinese electronic medical records (EMRs) data
Considering the advantage of convolutional neural network (CNN) in local feature extraction and modeling performance, we attempted to explore a framework based on a CNN model for intelligent diagnosis with pediatric Chinese EMRs
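The local feature extraction attributed to CNNs in the highlights can be sketched as a single convolutional layer followed by max-over-time pooling. This is a minimal pure-Python sketch with no training; the function name `conv_max_pool`, the ReLU activation, the filter widths and the toy two-dimensional "embeddings" are assumptions for illustration and are not taken from the paper.

```python
def conv_max_pool(embeddings, filters, biases=None):
    """One convolutional layer over a token sequence, then max-over-time pooling.

    embeddings: list of T word vectors, each a list of d floats.
    filters:    list of filters; each filter is a list of h weight vectors
                (one per position in the window), each of length d.
    Returns one pooled feature per filter.
    Illustrative assumption: ReLU activation, no padding, stride 1.
    """
    features = []
    for k, filt in enumerate(filters):
        width = len(filt)
        bias = biases[k] if biases is not None else 0.0
        best = 0.0  # ReLU outputs are non-negative, so 0.0 is a safe floor
        for start in range(len(embeddings) - width + 1):
            # Dot product between the filter and the window of word vectors.
            score = bias
            for offset in range(width):
                window_vec = embeddings[start + offset]
                score += sum(w * x for w, x in zip(filt[offset], window_vec))
            best = max(best, max(0.0, score))
        features.append(best)
    return features


# Toy sequence of three "word vectors" (hypothetical values).
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]   # width 2: looks at adjacent word pairs
unigram_filter = [[1.0, 1.0]]              # width 1: looks at single words
pooled = conv_max_pool(sequence, [bigram_filter, unigram_filter])
```

Because pooling keeps only the maximum response over all positions, each feature records whether a local pattern occurs somewhere in the record, not where. A width-1 filter is therefore insensitive to word order altogether, while wider filters capture order only within their window.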
Summary
Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks. Effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs.
Challenges of diagnosing using EMR data
An integrated electronic medical record system, which can collect, store, display, transmit and reproduce patient information, is becoming an essential part of the fabric of modern healthcare [1, 2]. Because EMRs are semi-structured or unstructured, their study belongs to the domain of natural language processing (NLP). Ananthakrishnan et al. [6] developed a robust electronic medical record-based model for classification of inflammatory bowel disease, combining codified data with information extracted from clinical text notes using natural language processing. Although some effective NLP methods have been proposed for EMRs, many challenges remain; a few of the most relevant are: