IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training.

Che Liu, Sibo Cheng, Miaojing Shi, Anand Shah, Wenjia Bai, Rossella Arcucci

doi:10.1109/tmi.2024.3449690

Abstract

In medical Vision-Language Pre-training (VLP), significant effort has been devoted to deriving text and image features from clinical reports and their associated medical images. However, most existing methods may have overlooked the opportunity to leverage the inherent hierarchical structure of clinical reports, which are generally split into 'findings' for descriptive content and 'impressions' for conclusive observations. Instead of exploiting this rich, structured format, current medical VLP approaches often treat the report as either a single unified entity or a set of fragmented tokens. In this work, we propose IMITATE, a novel clinical prior guided VLP framework that learns structural information from medical reports through hierarchical vision-language alignment. The framework derives multi-level visual features from chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report. Furthermore, a new clinical-informed contrastive loss is introduced for cross-modal learning, which accounts for clinical prior knowledge when formulating sample correlations in contrastive learning. IMITATE outperforms baseline VLP methods across six different datasets, spanning five medical imaging downstream tasks. Comprehensive experimental results highlight the advantages of integrating the hierarchical structure of medical reports for vision-language alignment.
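
To make the idea of hierarchical alignment with a prior-aware contrastive loss more concrete, the following is a minimal NumPy sketch, not the authors' implementation. Several things here are assumptions that the abstract does not specify: the pairing of mid-level visual features with 'findings' and high-level features with 'impressions', the use of text-to-text similarity as a stand-in for the clinical prior, and all function names and parameters (`soft_targets`, `smoothing`, `temperature`, and so on).

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def soft_targets(text_emb, smoothing=0.2):
    """Hypothetical prior: blend one-hot pairing with report-to-report similarity.

    Assumption: reports with similar text describe related cases, so they
    should not be treated as fully negative pairs.
    """
    t = l2_normalize(text_emb)
    sim = np.clip(t @ t.T, 0.0, None)          # keep non-negative similarities
    sim = sim / sim.sum(axis=1, keepdims=True)  # each row sums to 1
    return (1.0 - smoothing) * np.eye(len(t)) + smoothing * sim

def contrastive_loss(img_emb, text_emb, targets, temperature=0.1):
    """Symmetric cross-entropy between similarity logits and (soft) targets."""
    v, t = l2_normalize(img_emb), l2_normalize(text_emb)
    logits = v @ t.T / temperature
    i2t = -(targets * log_softmax(logits)).sum(axis=1).mean()
    t2i = -(targets * log_softmax(logits.T)).sum(axis=1).mean()
    return 0.5 * (i2t + t2i)

def hierarchical_loss(mid_visual, high_visual, findings_emb, impression_emb):
    """Align two visual levels with the two report sections (assumed pairing)."""
    descriptive = contrastive_loss(mid_visual, findings_emb,
                                   soft_targets(findings_emb))
    conclusive = contrastive_loss(high_visual, impression_emb,
                                  soft_targets(impression_emb))
    return descriptive + conclusive

# Toy usage with random embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
batch, dim = 8, 16
loss = hierarchical_loss(rng.normal(size=(batch, dim)),
                         rng.normal(size=(batch, dim)),
                         rng.normal(size=(batch, dim)),
                         rng.normal(size=(batch, dim)))
print(float(loss))
```

Setting `smoothing=0` in this sketch reduces the targets to the identity matrix, which recovers a standard one-to-one contrastive objective; larger values let similar reports in a batch act as partial positives.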


Journal: IEEE Transactions on Medical Imaging
Publication Date: Jan 1, 2024
Citations: 2
