Abstract

Predicting gene mutation status from whole slide images (WSIs) is crucial for the clinical treatment, cancer management, and research of gliomas. With advances in CNN and Transformer architectures, several promising models have been proposed. However, existing studies have paid little attention to fusing multi-magnification information, and their models must process all patches from a whole slide image. In this paper, we propose a cross-magnification attention model, CroMAM, for predicting the genetic status and survival of gliomas. CroMAM first uses a systematic patch extraction module to sample a subset of representative patches for downstream analysis. Next, it applies a Swin Transformer to extract local and global features from patches at different magnifications, and then acquires high-level features and dependencies among single-magnification patches with a Vision Transformer. Subsequently, CroMAM exchanges the integrated feature representations across magnifications, encouraging each representation to learn discriminative information from the other magnifications. Additionally, we design a cross-magnification attention analysis method that examines the effect of cross-magnification attention quantitatively and qualitatively, which increases the model's explainability. To validate the model, we compare it with other multi-magnification feature fusion models on three tasks across two datasets. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance in predicting the genetic status and survival of gliomas. The implementation of CroMAM will be made publicly available upon acceptance of this manuscript at https://github.com/GuoJisen/CroMAM.
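The sketch below illustrates the cross-magnification exchange described above: per-magnification patch features are aggregated by a Transformer encoder, and each magnification's tokens then attend to the other's via cross-attention before pooling and classification. It is a minimal PyTorch illustration, not the authors' released CroMAM code; the module names, feature dimensions, use of generic `nn.TransformerEncoder` layers as stand-ins for the Swin/ViT stages, and the concatenation-based head are all assumptions for illustration.

```python
# Hypothetical sketch of cross-magnification attention fusion (not the official CroMAM code).
import torch
import torch.nn as nn


class CrossMagnificationFusion(nn.Module):
    def __init__(self, dim: int = 768, heads: int = 8, depth: int = 2, num_classes: int = 2):
        super().__init__()
        # Per-magnification aggregation over patch tokens (stand-in for the ViT stage).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder_low = nn.TransformerEncoder(layer, num_layers=depth)
        self.encoder_high = nn.TransformerEncoder(layer, num_layers=depth)
        # Cross-magnification attention: each stream queries the other stream's tokens.
        self.cross_low = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_high = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, feats_low: torch.Tensor, feats_high: torch.Tensor) -> torch.Tensor:
        # feats_low / feats_high: (batch, num_patches, dim) patch embeddings produced by a
        # backbone (e.g., a Swin Transformer) applied at two different magnifications.
        tokens_low = self.encoder_low(feats_low)
        tokens_high = self.encoder_high(feats_high)
        # Exchange information: low-magnification tokens attend to high-magnification
        # tokens and vice versa, so each stream picks up cues from the other magnification.
        fused_low, _ = self.cross_low(tokens_low, tokens_high, tokens_high)
        fused_high, _ = self.cross_high(tokens_high, tokens_low, tokens_low)
        # Pool each stream and classify (e.g., gene mutation status).
        pooled = torch.cat([fused_low.mean(dim=1), fused_high.mean(dim=1)], dim=-1)
        return self.head(pooled)


if __name__ == "__main__":
    model = CrossMagnificationFusion()
    low = torch.randn(1, 64, 768)   # e.g., 64 sampled patches at low magnification
    high = torch.randn(1, 64, 768)  # e.g., 64 sampled patches at high magnification
    print(model(low, high).shape)   # torch.Size([1, 2])
```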
