Abstract

Deep learning (DL) algorithms based on brain MRI images have achieved great success in predicting Alzheimer's disease (AD), with classification accuracy exceeding even that of the most experienced clinical experts. As a novel feature fusion method, the Transformer has achieved excellent performance on many computer vision tasks, which has in turn driven its adoption in medical imaging. However, when a Transformer is used to fuse features from 3D MRI volumes, existing DL models treat all input local features equally, which is inconsistent with the fact that adjacent voxels have stronger semantic connections than spatially distant ones. Moreover, because medical imaging datasets are relatively small, a model that weighs all input features equally struggles to capture local lesion features within a limited number of training iterations. This paper proposes Conv-Swinformer, a deep learning model that focuses on extracting and integrating local fine-grained features. Conv-Swinformer consists of a CNN module and a Transformer encoder module: the CNN module summarizes the planar features of individual MRI slices, and the Transformer module establishes semantic connections among these planar features in 3D space. Introducing a shifted-window attention mechanism into the Transformer encoder restricts attention to small spatial regions of the MRI volume, which effectively suppresses irrelevant background semantics and lets the model capture local features more precisely. In addition, enlarging the attention window layer by layer further integrates local fine-grained features, strengthening the model's attentional capacity. Compared with DL algorithms that fuse local MRI features indiscriminately, Conv-Swinformer extracts local lesion features at a fine granularity and thus achieves better classification results.
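The abstract describes the architecture only at a high level; the following PyTorch sketch illustrates the general idea under stated assumptions and is not the authors' implementation. A small 2D CNN collapses each MRI slice into a single feature token, and a stack of Transformer encoder layers applies self-attention within non-overlapping windows of adjacent slice tokens, widening the window at each layer. The layer sizes, the window schedule (4, 8, 16 slices), the two-class head, and the omission of the window-shifting step are all illustrative simplifications.

```python
# Minimal sketch (not the authors' code): CNN slice encoder + Transformer
# encoder with attention restricted to local windows of neighboring slices,
# widened layer by layer. All sizes and the window schedule are assumptions.
import torch
import torch.nn as nn

class SliceCNN(nn.Module):
    """Encodes each 2D MRI slice into a single feature token."""
    def __init__(self, dim=96):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool each slice to one feature vector
        )

    def forward(self, x):                # x: (B, D, H, W), D slices per volume
        b, d, h, w = x.shape
        tokens = self.net(x.reshape(b * d, 1, h, w))  # (B*D, dim, 1, 1)
        return tokens.reshape(b, d, -1)               # (B, D, dim)

class WindowedEncoderLayer(nn.Module):
    """Self-attention over non-overlapping windows of adjacent slice tokens."""
    def __init__(self, dim, window, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                # x: (B, D, dim); D divisible by window
        b, d, c = x.shape
        win = x.reshape(b * d // self.window, self.window, c)
        h = self.norm1(win)
        win = win + self.attn(h, h, h, need_weights=False)[0]
        win = win + self.mlp(self.norm2(win))
        return win.reshape(b, d, c)

class ConvSwinformerSketch(nn.Module):
    def __init__(self, dim=96, windows=(4, 8, 16), n_classes=2):
        super().__init__()
        self.cnn = SliceCNN(dim)
        self.layers = nn.ModuleList(WindowedEncoderLayer(dim, w) for w in windows)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                # x: (B, D, H, W)
        t = self.cnn(x)
        for layer in self.layers:        # attention window widens per layer
            t = layer(t)
        return self.head(t.mean(dim=1))  # pooled volume representation -> logits

logits = ConvSwinformerSketch()(torch.randn(2, 16, 128, 128))  # shape (2, 2)
```

The final line runs a forward pass on a dummy 16-slice volume to show the expected input and output shapes; a real pipeline would attend within (and shift) 3D spatial windows rather than only along the slice axis.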
