Abstract

Cross-modality data translation has attracted great interest in medical image computing, and deep generative models have shown marked improvements in addressing the related challenges. Nevertheless, a fundamental problem in image translation remains open: zero-shot cross-modality image translation with high fidelity. To bridge this gap, we propose a novel unsupervised zero-shot learning method, the Mutual Information guided Diffusion Model (MIDiffusion), which learns to translate an unseen source image to the target modality by leveraging the inherent statistical consistency of mutual information between modalities. To overcome the prohibitive cost of high-dimensional mutual information computation, we propose a differentiable local-wise mutual information layer that conditions the iterative denoising process. This layer captures shared cross-modality features in the statistical domain, providing diffusion guidance without relying on direct mappings between the source and target domains. As a result, our method adapts to changing source domains without retraining, making it highly practical when sufficient labeled source-domain data is unavailable. We demonstrate the superior performance of MIDiffusion on zero-shot cross-modality translation tasks through empirical comparisons with other generative models, including adversarial-based and diffusion-based baselines. Finally, we showcase a real-world application of MIDiffusion in 3D zero-shot cross-modality image segmentation.
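To make the guidance idea concrete, the sketch below estimates mutual information over local image patches with soft (Parzen-window) histograms, which keeps the estimate differentiable, and uses its gradient to nudge a denoising estimate toward the statistics of the source image. This is a minimal illustration of one plausible reading of the abstract: the functions `soft_histogram_2d` and `local_mutual_information`, the patch size, bin count, and guidance step size are hypothetical assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a differentiable local mutual information (MI)
# estimate, one plausible reading of the paper's "local-wise mutual
# information layer". All names and parameters are illustrative.
import torch

def soft_histogram_2d(x, y, n_bins=32, sigma=0.05):
    """Differentiable joint histogram of two flattened patches in [0, 1]."""
    centers = torch.linspace(0.0, 1.0, n_bins, device=x.device)
    # Gaussian kernel weight of each sample w.r.t. each bin center: (N, B)
    wx = torch.exp(-0.5 * ((x.unsqueeze(1) - centers) / sigma) ** 2)
    wy = torch.exp(-0.5 * ((y.unsqueeze(1) - centers) / sigma) ** 2)
    joint = wx.t() @ wy           # (B, B) soft joint counts
    return joint / joint.sum()    # normalize to a probability table

def local_mutual_information(a, b, patch=16, eps=1e-10):
    """Mean MI over non-overlapping patches of two (H, W) images in [0, 1]."""
    h, w = a.shape
    mi_vals = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            pa = a[i:i + patch, j:j + patch].reshape(-1)
            pb = b[i:i + patch, j:j + patch].reshape(-1)
            p_xy = soft_histogram_2d(pa, pb)
            p_x = p_xy.sum(dim=1, keepdim=True)   # marginal over y
            p_y = p_xy.sum(dim=0, keepdim=True)   # marginal over x
            # MI = sum p(x,y) * log( p(x,y) / (p(x) p(y)) )
            mi = (p_xy * (torch.log(p_xy + eps)
                          - torch.log(p_x + eps)
                          - torch.log(p_y + eps))).sum()
            mi_vals.append(mi)
    return torch.stack(mi_vals).mean()

# Schematic guidance step inside a denoising loop: ascend the MI objective
# so the current clean-image estimate stays statistically consistent with
# the (unseen) source-modality image.
source = torch.rand(64, 64)                       # source-modality image
x0_hat = torch.rand(64, 64, requires_grad=True)   # current denoised estimate
mi = local_mutual_information(x0_hat, source)
grad = torch.autograd.grad(mi, x0_hat)[0]
x0_guided = (x0_hat + 0.1 * grad).detach()        # 0.1 = assumed step size
```

In an actual sampler this gradient step would be applied at each denoising iteration; because the guidance depends only on a statistical similarity measure rather than a learned source-to-target mapping, the same trained model can, in principle, serve new source domains without retraining.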
