Abstract

Multi-modal learning, which uses unpaired labeled data from multiple modalities to boost the performance of deep learning models on each individual modality, has recently attracted considerable interest in medical image segmentation. However, existing unpaired multi-modal learning methods require a large amount of labeled data from both modalities to obtain satisfactory segmentation results, and such labeled data is not easy to obtain in practice. In this paper, we investigate the use of unlabeled data for label-efficient unpaired multi-modal learning, focusing on the scenario where labeled data is scarce and unlabeled data is abundant. We term this new problem Semi-Supervised Unpaired Multi-Modal Learning and propose a novel deep co-training framework for it. Specifically, our framework consists of two segmentation networks, one trained for each modality. Unlabeled data is used to learn two image translation networks that translate images across modalities, so that labeled data from one modality can, after translation, also be used to train the segmentation network of the other modality. To prevent overfitting in the label-scarce setting, we introduce a new semantic consistency loss that regularizes the predictions of an image and its translation, produced by the two segmentation networks, to be semantically consistent. We further design a novel class-balanced deep co-training scheme that effectively leverages the valuable complementary information from both modalities to boost segmentation performance. We verify the effectiveness of our framework on two medical image segmentation tasks, where it outperforms existing methods significantly.
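
The semantic consistency idea described in the abstract can be illustrated with a short sketch. The PyTorch-style snippet below is a minimal illustration under stated assumptions: the network objects (`seg_a`, `seg_b`, `translate_a_to_b`) are hypothetical placeholders, the symmetric KL divergence used as the consistency measure and the choice to hold the translator fixed are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of a semantic consistency term between a modality-A image and
# its translation into modality B. All names and the KL-based consistency
# measure are illustrative assumptions, not the paper's exact loss.
import torch
import torch.nn.functional as F


def semantic_consistency_loss(seg_a, seg_b, translate_a_to_b, image_a):
    """Encourage the two segmentation networks to agree on an image and its translation.

    seg_a, seg_b:      segmentation networks for modality A and modality B
    translate_a_to_b:  image translation network mapping modality A to modality B
    image_a:           batch of (possibly unlabeled) modality-A images, shape (N, C, H, W)
    """
    logits_a = seg_a(image_a)                  # prediction on the original image
    with torch.no_grad():                      # translator treated as fixed here (an assumption)
        image_b = translate_a_to_b(image_a)    # synthetic modality-B counterpart
    logits_b = seg_b(image_b)                  # prediction on the translated image

    # Symmetric KL between the two class-probability maps (one illustrative choice).
    log_p_a = F.log_softmax(logits_a, dim=1)
    log_p_b = F.log_softmax(logits_b, dim=1)
    loss = 0.5 * (F.kl_div(log_p_a, log_p_b.exp(), reduction="batchmean")
                  + F.kl_div(log_p_b, log_p_a.exp(), reduction="batchmean"))
    return loss
```

In a full training loop, a term of this kind would presumably be added, with a weighting factor, to the supervised segmentation losses computed on the few labeled images in each modality.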
