Transformer-based cross-modal multi-contrast network for ophthalmic diseases diagnosis

Yang Yu, Hongqing Zhu

doi:10.1016/j.bbe.2023.06.001

Abstract

Automatic diagnosis of ophthalmic diseases from ocular medical images is vital for supporting clinical decisions. Most current methods rely on a single imaging modality, most often 2D fundus images. Because the diagnosis of ophthalmic diseases can benefit greatly from multiple imaging modalities, this paper aims to improve diagnostic accuracy by making effective use of cross-modal data. We propose a Transformer-based cross-modal multi-contrast network that efficiently fuses the color fundus photograph (CFP) and optical coherence tomography (OCT) modalities to diagnose ophthalmic diseases. We design a multi-contrast learning strategy to extract discriminative features from cross-modal data for diagnosis. A channel fusion head then captures the semantically shared information across modalities and the similarity features between patients of the same category. In addition, we use a class-balanced training strategy to cope with the class imbalance that is common in medical datasets. Our method is evaluated on public benchmark datasets for cross-modal ophthalmic disease diagnosis. The experimental results demonstrate that our method outperforms other approaches. The code and models are available at https://github.com/ecustyy/tcmn.

Full Text