Abstract
Multimodal neuroimaging data, including magnetic resonance imaging (MRI) and positron emission tomography (PET), provide complementary information about the brain that can aid the diagnosis of Alzheimer's disease (AD). However, most existing deep learning methods still rely on patch-based extraction from the neuroimaging data, which typically yields suboptimal performance because the extraction step is isolated from the subsequent network and fails to capture the varying scales of structural changes in the cerebrum. Moreover, these methods often simply concatenate the multimodal data, ignoring the interactions between modalities that can highlight discriminative regions and thereby improve AD diagnosis. To tackle these issues, we develop a multimodal, multi-scale deep learning model that effectively leverages the interactions between the modalities and scales of the neuroimaging data. First, we employ a convolutional neural network to embed each scale of the multimodal images. Second, we propose multimodal scale fusion mechanisms that use both multi-head self-attention and multi-head cross-attention to capture global relations among the embedded features and to weigh each modality's contribution to the other, thereby enhancing feature extraction and the interaction between MRI and PET images at each scale. Third, we introduce a cross-modality fusion module that applies multi-head cross-attention to fuse MRI and PET data at different scales and to promote the global features from the preceding attention layers. Finally, the features from all scales are fused to discriminate between the different stages of AD. We evaluated the proposed method on the ADNI dataset, and the results show that our model outperforms state-of-the-art methods.
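To make the fusion idea concrete, the sketch below shows a minimal self-attention plus cross-attention block in PyTorch for one image scale. It is not the authors' implementation: the class name CrossModalFusion, the token shapes, and the embedding dimension are assumptions for illustration, with the CNN embedding and classification head omitted.

```python
# Hypothetical sketch of per-scale MRI-PET attention fusion (not the paper's code).
# Assumes MRI and PET features have already been embedded by a CNN into token
# sequences of shape (batch, tokens, dim).
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Self-attention captures global relations within each modality.
        self.self_attn_mri = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.self_attn_pet = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Cross-attention weighs one modality's contribution to the other.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, mri_tokens: torch.Tensor, pet_tokens: torch.Tensor) -> torch.Tensor:
        mri, _ = self.self_attn_mri(mri_tokens, mri_tokens, mri_tokens)
        pet, _ = self.self_attn_pet(pet_tokens, pet_tokens, pet_tokens)
        # MRI queries attend to PET keys/values, fusing the two modalities.
        fused, _ = self.cross_attn(mri, pet, pet)
        return self.norm(fused + mri)  # residual connection

# Usage: one fusion block per scale; the fused features of all scales would
# then be concatenated and passed to a classifier (not shown here).
mri = torch.randn(2, 64, 256)
pet = torch.randn(2, 64, 256)
print(CrossModalFusion()(mri, pet).shape)  # torch.Size([2, 64, 256])
```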