Abstract
The increasing use of electronic health records (EHRs) generates a vast amount of data that can be leveraged for predictive modeling and for improving patient outcomes. However, EHR data are typically a mixture of structured and unstructured data, which presents two major challenges. First, although several studies have used machine learning models to predict patient outcomes, these models often require data in a structured format, which may lead to the loss of important information. Second, unstructured data, such as narrative reports, can be noisy and pose difficulties for natural language processing (NLP) applications and for interoperability. There is therefore a need to bridge the gap between structured EHR data and NLP-based predictive models. In this paper, we propose a fuzzy-logic-based pipeline that generates medical narratives from structured EHR data, and we evaluate its performance in predicting patient outcomes. The pipeline includes a feature selection operation and a reasoning and inference function that generates the narratives. We extensively evaluate the generated narratives using transformer-based NLP models on a patient-outcome-prediction task, and we assess the interpretability of the generated text using Shapley values. Our approach achieves performance comparable to the benchmark baseline models, with an F1-score of 93.7%, and slightly better recall. It also preserves information and offers interpretability, both inherited from the nuanced, structured narratives. To the best of our knowledge, this is the first study to demonstrate that tabular data can be transformed into text so that NLP can be applied to a prediction task.
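As a rough illustration of the idea summarized above, the following minimal sketch turns one structured value into a sentence using fuzzy membership functions. It is not the pipeline from the paper: the triangular membership shape, the heart-rate breakpoints, the sentence template, and the names `triangular`, `fuzzify`, `narrate`, and `HEART_RATE_SETS` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

FuzzySets = Dict[str, Tuple[float, float, float]]


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets for heart rate in beats per minute.
# The breakpoints are illustrative only, not clinical thresholds.
HEART_RATE_SETS: FuzzySets = {
    "low": (20.0, 45.0, 65.0),
    "normal": (55.0, 75.0, 100.0),
    "high": (90.0, 130.0, 220.0),
}


def fuzzify(value: float, fuzzy_sets: FuzzySets) -> Dict[str, float]:
    """Return the degree of membership of `value` in each linguistic term."""
    return {term: triangular(value, *abc) for term, abc in fuzzy_sets.items()}


def narrate(feature_name: str, value: float, fuzzy_sets: FuzzySets) -> str:
    """Describe one structured value with its best-matching linguistic term."""
    memberships = fuzzify(value, fuzzy_sets)
    term = max(memberships, key=memberships.get)
    if memberships[term] == 0.0:
        # No fuzzy set covers this value; say so instead of guessing a term.
        return f"The patient's {feature_name} is outside the modelled range."
    return f"The patient's {feature_name} is {term}."


if __name__ == "__main__":
    print(narrate("heart rate", 120.0, HEART_RATE_SETS))
```

Sentences produced this way for each selected feature could be concatenated into a narrative and passed to a text classifier; how the actual pipeline selects features and composes its narratives is not specified in this abstract.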