Abstract

Background: Electronic health records (EHRs) provide possibilities to improve patient care and facilitate clinical research. However, applications of EHRs face many challenges, such as temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, temporal information is difficult for traditional machine learning methods to use effectively, even though the sequential information in EHRs is very useful.

Method: In this paper, we propose a general-purpose patient representation learning approach to summarize sequential EHRs. Specifically, a recurrent neural network based denoising autoencoder (RNN-DAE) is employed to encode the in-hospital records of each patient into a low-dimensional dense vector.

Results: Based on EHR data collected from Shuguang Hospital affiliated to Shanghai University of Traditional Chinese Medicine, we experimentally evaluate our proposed RNN-DAE method on both a mortality prediction task and a comorbidity prediction task. Extensive experimental results show that our proposed RNN-DAE method outperforms existing methods. In addition, we apply the “Deep Feature” represented by our proposed RNN-DAE method to track similar patients with t-SNE, which also yields some interesting observations.

Conclusion: We propose an effective unsupervised RNN-DAE method to summarize patient sequential information in EHR data. Our proposed RNN-DAE method is useful on both the mortality prediction task and the comorbidity prediction task.
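The idea of encoding a patient's record sequence into a single dense vector with a denoising objective can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and not the authors' architecture: the masking-noise corruption, the plain tanh recurrent cells, the random untrained weights, and all names and sizes (`corrupt`, `rnn_encode`, `rnn_decode`, `T`, `D`, `H`) are hypothetical. The sketch only shows the data flow: corrupt the sequence, compress it to a code vector, then reconstruct and score against the clean sequence.

```python
import numpy as np


def corrupt(seq, drop_prob, rng):
    """Masking noise: zero each feature independently with probability drop_prob."""
    mask = rng.random(seq.shape) >= drop_prob
    return seq * mask


def rnn_encode(seq, W_in, W_h, b):
    """Run a plain tanh RNN over a (T, D) sequence; return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in seq:
        h = np.tanh(W_in @ x_t + W_h @ h + b)
    return h


def rnn_decode(h, steps, W_h, W_out, b_h, b_out):
    """Unroll a decoder RNN from the code vector, emitting one record per step."""
    outputs = []
    for _ in range(steps):
        h = np.tanh(W_h @ h + b_h)
        outputs.append(W_out @ h + b_out)
    return np.stack(outputs)


rng = np.random.default_rng(0)

# Hypothetical sizes: 5 in-hospital records, 12 features each, 4-dimensional code.
T, D, H = 5, 12, 4
records = rng.random((T, D))

# Randomly initialised (untrained) parameters, for shape illustration only.
enc_W_in = rng.normal(scale=0.1, size=(H, D))
enc_W_h = rng.normal(scale=0.1, size=(H, H))
enc_b = np.zeros(H)
dec_W_h = rng.normal(scale=0.1, size=(H, H))
dec_W_out = rng.normal(scale=0.1, size=(D, H))
dec_b_h = np.zeros(H)
dec_b_out = np.zeros(D)

noisy = corrupt(records, 0.2, rng)                   # corrupted input
code = rnn_encode(noisy, enc_W_in, enc_W_h, enc_b)   # fixed-length patient vector
recon = rnn_decode(code, T, dec_W_h, dec_W_out, dec_b_h, dec_b_out)

# Denoising objective: reconstruct the clean sequence from the corrupted one.
loss = np.mean((recon - records) ** 2)
```

In a trained model, the parameters would be fitted to minimise this reconstruction loss over many patients, and `code` would serve as the patient representation passed to downstream predictors.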

Highlights

  • Electronic health records (EHRs) provide possibilities to improve patient care and facilitate clinical research

  • Based on EHR data collected from Shuguang Hospital affiliated to Shanghai University of Traditional Chinese Medicine, we experimentally evaluate our proposed recurrent neural network based denoising autoencoder (RNN-DAE) method on both a mortality prediction task and a comorbidity prediction task

  • Extensive experimental results show that our proposed RNN-DAE method outperforms existing methods


Summary

Introduction

Ruan et al. BMC Medical Informatics and Decision Making 2019, 19(Suppl 8):259

Electronic health records (EHRs) provide possibilities to improve patient care and facilitate clinical research. For instance, Mikolov et al [6] applied neural language models to learn a distributed representation for each word, called a word embedding. They further proposed an unsupervised algorithm [7] to learn fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents. Devlin et al [9] proposed a language representation model called bidirectional encoder representations from transformers to generate word embeddings. Those representations achieve effective results on multiple natural language processing tasks, such as question answering and language inference. Compared to the vectors generated by these methods, those derived by representation learning models are low-dimensional and dense, and they capture the semantics in context.
