Electronic Health Records (EHR) have been widely adopted by hospitals to improve clinical decision making and readmission prediction. Each patient admission usually contains both multiple medical codes and clinical notes. Accurately learning representations (also called embeddings) of medical concepts from EHR data is a key strategy for improving prediction performance in healthcare. Existing works employ medical ontologies to improve the quality of representations but focus solely on the relationships among the medical codes, ignoring textual data such as medical code descriptions, clinical notes, and patient demographics. In this paper, we propose a new model called Semantic-based Attention model using Textual data for Medical Concept Embedding (SATexMCE). SATexMCE consists of three parts: medical code embedding, admission embedding, and a prediction model. First, we generate representations of medical codes using both the textual descriptions of the codes and the relationships among them. Then, we generate the admission representation from the representations of the medical codes in the admission, the clinical notes, and the demographic information associated with the admission, combined via several attention mechanisms. The admission embeddings are fed into a recurrent neural network that predicts a patient’s readmission and a disease in the next admission based on patient admission data. Experimental results show that the proposed SATexMCE model improves not only the performance of readmission prediction but also the quality of medical concept representations. The attention mechanisms also help us measure the importance of different medical codes and understand the meaning of the words in clinical notes for predictions.
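To make the admission-embedding step concrete, the following is a minimal sketch of attention pooling over medical code embeddings. It is not the SATexMCE formulation: the abstract does not specify the scoring function, so plain dot-product attention is assumed here, and the names `attention_pool`, `codes`, and `query`, along with the toy 2-dimensional vectors, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(code_embeddings, query):
    """Pool code embeddings into one vector with dot-product attention.

    Assumption: the score of each code is its dot product with a context
    vector `query`; a trained model would learn this scoring function.
    """
    scores = [sum(c * q for c, q in zip(emb, query)) for emb in code_embeddings]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, code_embeddings))
        for i in range(dim)
    ]
    return pooled, weights


# Hypothetical embeddings for three medical codes in a single admission.
codes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # hypothetical context vector
admission_vec, weights = attention_pool(codes, query)
```

The returned `weights` sum to one and can be read as the relative importance assigned to each code, which is the kind of interpretability the abstract attributes to attention.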