Abstract

PURPOSE: To compare the performance and explainability of vision transformer (ViT) and convolutional neural network (CNN) architectures in predicting genomic mutations from brain MRI.

METHODS: ViT and CNN classification models were compared on the task of predicting the IDH mutation status of gliomas. Both models were fine-tuned on the TCIA dataset, then evaluated on the TCIA dataset and on an external independent dataset, the Japanese Cohort (JC) dataset. To evaluate explanatory power, gradient-weighted class activation mapping (Grad-CAM) visualizations of the CNN model were compared with attention map visualizations of the ViT model.

RESULTS: The ViT model outperformed the CNN model on both the TCIA and JC datasets, and the differences were statistically significant (p = 0.021 and p < 0.001, respectively). The attention map of the ViT model accurately highlighted the tumor, whereas the Grad-CAM of the CNN model sometimes highlighted non-tumor areas.

CONCLUSION: The ViT model was more robust to differences in the image domain, and its attention map offered superior explainability.
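For readers unfamiliar with Grad-CAM, the following is a minimal sketch of the standard computation, not the authors' implementation. It assumes the feature maps of a convolutional layer and the gradients of the class score with respect to those maps have already been extracted (for example with framework hooks); the function name `grad_cam` and the array shapes are illustrative assumptions.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap (hypothetical helper, for illustration).

    activations: feature maps of one conv layer, shape (K, H, W).
    gradients:   d(class score) / d(activations), same shape (K, H, W).
    Returns a heatmap of shape (H, W), scaled to [0, 1] when non-zero.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # ReLU keeps only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    # Normalize for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: two 2x2 feature maps with opposite-sign gradients.
toy_activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                            [[0.0, 0.0], [0.0, 2.0]]])
toy_gradients = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
toy_cam = grad_cam(toy_activations, toy_gradients)
```

In the toy example only the first channel has a positive weight, so the heatmap is non-zero only where that channel is active. The resulting map is typically upsampled to the input resolution and overlaid on the MRI slice, which is how a highlight on a non-tumor area would become visible.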
