Identifying heart disease risk factors from electronic health records using an ensemble of deep learning method

Linkai Luo, Yue Wang, Daniel Y Mo

doi:10.1080/24725579.2023.2205665

Abstract

Heart disease is a leading cause of death worldwide. For decades, cardiologists have sought to identify its risk factors to support prediction, prevention, and treatment. In recent years, electronic health records (EHRs) have become a valuable source for detecting these risk factors (e.g., smoking, obesity, and diabetes). However, EHRs contain clinical notes written as free-form, unstructured text, which makes retrieving the relevant information tedious for cardiologists. To address this problem, we devised a deep-learning-based ensemble approach that automatically identifies heart-disease risk factors from EHRs. The proposed approach efficiently extracts semantic information from EHRs and automates risk-factor identification with high performance. In particular, it requires no external domain knowledge about the disease, because a Bidirectional Encoder Representations from Transformers (BERT) model encodes the discriminative features of the clinical notes. The extracted features are then fed to a conditional random field (CRF) to identify all possible risk-factor indicators. Experimental results show that, when no external knowledge is available, the proposed approach achieves state-of-the-art performance.
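
To make the encoder-to-CRF step described in the abstract concrete, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF over per-token scores. It is an illustration under stated assumptions, not the authors' implementation: the tag names (`O`, `B-RISK`, `I-RISK`), the emission scores standing in for BERT outputs, and the transition scores are all made up for this example.

```python
# Hypothetical BIO tag set; the label scheme used in the paper is not stated in the abstract.
TAGS = ["O", "B-RISK", "I-RISK"]


def viterbi_decode(emissions, transitions, tags=TAGS):
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   -- score of tag j at token t (stand-in for encoder output)
    transitions[i][j] -- score of moving from tag i to tag j
    """
    n_tags = len(tags)
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []           # best previous tag for each (token, tag) pair
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return [tags[j] for j in path]


# Made-up scores for the tokens: "patient", "has", "diabetes", "mellitus".
emissions = [
    [2.0, 0.1, 0.0],
    [2.0, 0.2, 0.0],
    [0.1, 1.5, 1.0],
    [0.5, 0.3, 1.2],
]
# Rows are the previous tag, columns the next tag; O -> I-RISK is heavily penalized.
transitions = [
    [0.5, 0.0, -10.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]
predicted = viterbi_decode(emissions, transitions)
print(predicted)
```

The transition scores let the decoder rule out inconsistent label sequences, such as an inside tag directly after an outside tag, which per-token classification alone cannot enforce.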

Full Text