Abstract

The emergence and widespread adoption of electronic health records (EHRs) in modern healthcare systems has generated vast amounts of data with the potential to significantly improve patient outcomes. However, extracting meaningful insights from EHRs is challenging due to their volume and inherent complexity. Leveraging multiple data modalities enriches patient data representations. Yet current multi-modal research relating the latent graph structure of medical codes to unstructured clinical text overlooks the inherent disparities and inconsistencies between these modalities. To address this gap, this study introduces a pretraining approach named Graph and Text Multi-Modal Representation Learning with momentum distillation on Electronic Health Records (GTMMRL). The approach tackles the challenge of noisy and unreliable labels, using the large open-source EHR dataset MIMIC-III for pretraining. Five carefully selected proxy tasks are employed, each guiding the student model’s learning through pseudo-targets generated by a teacher model maintained as an exponential moving average of the student, thereby mitigating overfitting. We evaluate the model on five downstream tasks, each addressing real-world challenges in EHR data analysis. Results show that our method achieves state-of-the-art performance across these downstream tasks, highlighting its role in enhancing the quality of insights extracted from complex EHR data. The code is available at https://github.com/deepEMR/GTMMRL.
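
For intuition, the sketch below illustrates the general momentum-distillation pattern the abstract describes: a teacher maintained as an exponential moving average (EMA) of the student supplies soft pseudo-targets that regularize training on noisy labels. The function names, momentum value, and loss weighting here are illustrative assumptions, not details taken from GTMMRL itself.

```python
import copy
import torch
import torch.nn.functional as F

# Minimal sketch of EMA-based momentum distillation (assumed setup, not the
# paper's exact implementation): the teacher is an exponential moving average
# of the student, and its soft outputs act as pseudo-targets.

def build_teacher(student: torch.nn.Module) -> torch.nn.Module:
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)  # the teacher is never updated by gradients
    return teacher

@torch.no_grad()
def ema_update(teacher, student, momentum=0.995):
    # teacher <- m * teacher + (1 - m) * student
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)

def distillation_loss(student_logits, teacher_logits, hard_targets, alpha=0.4):
    # Blend the hard-label loss with a KL term toward the teacher's
    # pseudo-target distribution, which softens the effect of noisy labels.
    hard = F.cross_entropy(student_logits, hard_targets)
    soft = F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
    return (1 - alpha) * hard + alpha * soft
```

In this pattern, `ema_update` is called after each optimizer step so the teacher changes slowly, and the pseudo-targets it produces stay more stable than the raw labels.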
