Abstract

Central Retinal Artery Occlusion (CRAO) is a rare and severe ophthalmic disease that remains challenging to diagnose and classify accurately in clinical practice. The low incidence of CRAO makes it difficult to gather a large-scale dataset for training deep-learning models for CRAO classification. Although the integration of multimodal information has shown the potential to enhance classification performance, acquiring complete multimodal data poses significant challenges, mainly because of limited medical resources and examination costs. Consequently, existing deep-learning approaches struggle to learn discriminative features for CRAO classification. In this work, we propose a novel deep-learning methodology that leverages multi-task learning and trustworthy fusion to improve CRAO classification from incomplete multimodal data. In the feature-extraction stage, we design a multi-task framework that performs classification and lesion segmentation simultaneously; the segmentation task helps the image encoder learn more disease-related features. In particular, text annotations of lesion regions are used to compute the similarity between text features and image features, which serves as an auxiliary loss for learning discriminative representations for fine-grained classification. In the multimodal-fusion stage, we propose a trustworthy fusion strategy that learns effective joint representations from incomplete multimodal data, in which the model's predictive uncertainty is used to adaptively weight modality-specific features. We evaluate our method on a self-collected dataset and compare its performance with other state-of-the-art approaches. The results show the superiority of the proposed method, with a mean accuracy of 90.31%.
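
To make the multi-task objective concrete, the following is a minimal PyTorch sketch of a combined loss of the kind the abstract describes: a classification term, a lesion-segmentation term, and a text-image similarity term. The function name `multitask_loss`, the weighting coefficients `lambda_seg` and `lambda_sim`, and the cosine form of the similarity term are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def multitask_loss(cls_logits, cls_labels,
                   seg_logits, seg_masks,
                   img_feats, txt_feats,
                   lambda_seg=1.0, lambda_sim=0.1):
    """Joint objective: classification + lesion segmentation +
    text-image feature alignment (all weights are illustrative)."""
    # Primary task: CRAO classification.
    loss_cls = F.cross_entropy(cls_logits, cls_labels)

    # Auxiliary task: pixel-wise lesion segmentation
    # (seg_masks is a float tensor of the same shape as seg_logits).
    loss_seg = F.binary_cross_entropy_with_logits(seg_logits, seg_masks)

    # Auxiliary alignment: pull image features toward the text features
    # of their lesion annotations; here a cosine-similarity loss is
    # assumed as one plausible form of the similarity term.
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    loss_sim = 1.0 - (img * txt).sum(dim=-1).mean()

    return loss_cls + lambda_seg * loss_seg + lambda_sim * loss_sim
```

In this formulation, the segmentation and similarity terms act purely as regularizers on the shared image encoder: only `loss_cls` reflects the end task, while the auxiliary terms push the encoder toward lesion-aware, text-grounded features.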
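The trustworthy fusion stage can be sketched in a similar spirit. The code below weights each modality's features by its predictive confidence and masks out missing modalities; it assumes an evidential formulation (uncertainty u = K / Σα from a Dirichlet over K classes, as in trusted multi-view classification), which may differ from the paper's exact fusion strategy. All names (`trustworthy_fusion`, `present`, etc.) are hypothetical.

```python
import torch

def trustworthy_fusion(feats, logits_list, present, num_classes):
    """feats: list of M modality features, each [B, D].
    logits_list: list of M per-modality class logits, each [B, K].
    present: [B, M] 0/1 mask marking which modalities are available."""
    weights = []
    for m, logits in enumerate(logits_list):
        evidence = torch.relu(logits)        # non-negative evidence
        alpha = evidence + 1.0               # Dirichlet parameters
        u = num_classes / alpha.sum(-1)      # predictive uncertainty, [B]
        conf = (1.0 - u) * present[:, m]     # zero weight if modality missing
        weights.append(conf)

    w = torch.stack(weights, dim=1)          # [B, M]
    w = w / w.sum(dim=1, keepdim=True).clamp_min(1e-8)

    # Confidence-weighted sum of modality-specific features -> joint rep.
    fused = sum(w[:, m:m + 1] * feats[m] for m in range(len(feats)))
    return fused                             # [B, D]
```

Masking with `present` lets the same fusion rule handle incomplete modality sets: an absent modality simply contributes zero weight, so no imputation of missing examinations is required.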
