Abstract
Background: Pneumoconiosis staging is challenging because X-ray images have low clarity and the lesions are small and diffuse. In addition, the scarcity of annotated data makes it difficult to develop accurate staging models. Although clinical text reports provide valuable contextual information, existing works focus primarily on designing multimodal image-text contrastive learning tasks and neglect the high similarity among pneumoconiosis imaging representations. This leads to inadequate extraction of fine-grained multimodal information and underutilization of domain knowledge, which limits the application of these methods to medical tasks.
Objective: This study aims to address the limitations of current multimodal methods by proposing a new approach that improves the precision of pneumoconiosis diagnosis and staging through enhanced fine-grained learning and better use of domain knowledge.
Methods: The proposed Multimodal Similarity-aware and Knowledge-driven Pre-Training (MSK-PT) approach has two stages. In the first stage, we analyze the similar features of pneumoconiosis images and use a similarity-aware modality alignment strategy to explore the fine-grained representations and associated disturbances of pneumoconiosis lesions between images and texts, guiding the model to match more appropriate feature representations. In the second stage, we use data-associated features and pre-stored domain knowledge features as priors and constraints to guide the downstream model in the visual domain without annotations. Because model predictions can produce erroneous labels, we further introduce an uncertainty threshold strategy to mitigate the negative impact of imperfect predicted labels and to enhance model interpretability.
Results: We collected data and created the pneumoconiosis chest X-ray (PneumoCXR) dataset to evaluate the proposed MSK-PT method. In the experiments, our method achieved a classification accuracy of 81.73%, outperforming the state-of-the-art algorithms by 2.53%.
Conclusions: MSK-PT showed diagnostic performance that matches or exceeds that of the average radiologist, even with limited labeled data, highlighting the method's effectiveness and robustness.