Objective: Alzheimer's disease (AD) is an irreversible neurodegenerative disease, and mild cognitive impairment (MCI) is its clinical precursor. Differentiating AD, MCI, and normal controls (NC) from noninvasive magnetic resonance imaging (MRI) therefore has clear clinical value. Materials and methods: We use a 3D residual network to classify AD, MCI, and NC. We add a multiscale module to the original network to strengthen its feature representation, and a cross-dimensional attention mechanism to focus the network on important brain regions. We verified experimentally that the network tends to overestimate brain age for subjects in the AD and MCI subgroups, which indicates that the brain age prediction task is closely related to the AD classification task. We therefore adopted a multi-task learning approach, using brain age prediction as an auxiliary task for AD classification to reduce the risk of overfitting during training. Results: Our method achieved 96.02% accuracy, 93.40% precision, 91.48% recall, and a 92.24% F1 score in AD/MCI/NC classification. Conclusions: Ablation experiments confirmed that the proposed cross-dimensional attention and multiscale modules improve diagnostic performance for AD and MCI, and that multi-task learning with brain age prediction improves performance further.
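The multi-task idea described above can be illustrated with a minimal sketch of a joint objective for a single subject. The abstract does not state the loss functions or their weighting, so everything here is an assumption made for illustration: cross-entropy for the three-way classification, absolute error for brain age, the hypothetical weight `age_weight`, the label order (0 = NC, 1 = MCI, 2 = AD), and the helper names `multitask_loss` and `brain_age_gap`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(class_logits, true_class, predicted_age, true_age,
                   age_weight=0.1):
    """Joint loss for one subject: classification plus brain age regression.

    Assumptions (not taken from the paper): cross-entropy for the
    classification term, absolute error for the age term, and a fixed
    hypothetical weight `age_weight` on the auxiliary task.
    """
    probs = softmax(class_logits)
    classification_loss = -math.log(probs[true_class])
    age_loss = abs(predicted_age - true_age)
    return classification_loss + age_weight * age_loss


def brain_age_gap(predicted_age, true_age):
    """Predicted minus chronological age; positive means overestimation."""
    return predicted_age - true_age


# Example: a subject labelled AD (class 2, assumed label order) aged 70,
# for whom the network predicts a brain age of 75.
loss = multitask_loss([0.2, 0.5, 2.0], true_class=2,
                      predicted_age=75.0, true_age=70.0)
gap = brain_age_gap(75.0, 70.0)
```

Under these assumptions, a positive `brain_age_gap` corresponds to the overestimation reported for the AD and MCI subgroups, and `age_weight` controls how strongly the auxiliary age task regularizes the shared features.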