Classification of brain tumours in MR images using deep spatiospatial models

Soumick Chatterjee, Andreas Nürnberger, Oliver Speck, Faraz Ahmed Nizamani

doi:10.1038/s41598-022-05572-6

Abstract

A brain tumour is a mass or cluster of abnormal cells in the brain that can become life-threatening because of its ability to invade neighbouring tissues and form metastases. An accurate diagnosis is essential for successful treatment planning, and magnetic resonance imaging is the principal imaging modality for diagnosing brain tumours and their extent. Deep learning methods in computer vision applications have improved significantly in recent years, largely because sizeable amounts of data are available to train models and because improved model architectures yield better approximations in a supervised setting. Classifying tumours using such deep learning methods has made significant progress with the availability of open datasets with reliable annotations. Typically, those methods are either 3D models, which use 3D volumetric MRIs, or 2D models, which consider each slice separately. However, by treating one spatial dimension separately, or by considering the slices as a sequence of images over time, spatiotemporal models can be employed as “spatiospatial” models for this task. These models can learn specific spatial and temporal relationships while reducing computational costs. This paper uses two spatiotemporal models, ResNet (2+1)D and ResNet Mixed Convolution, to classify different types of brain tumours. Both models outperformed the pure 3D convolutional model, ResNet18. Furthermore, pre-training the models on a different, even unrelated, dataset before training them for tumour classification improved the performance. Finally, the pre-trained ResNet Mixed Convolution was the best model in these experiments, achieving a macro F1-score of 0.9345 and a test accuracy of 96.98%, while also being the model with the least computational cost.

Highlights

This paper shows that the spatiotemporal models ResNet (2+1)D and ResNet Mixed Convolution, working as spatiospatial models, can improve the classification of brain tumour grades, as well as the classification of brain images with and without tumours, while reducing computational costs.
The spatiospatial models performed better than a pure 3D convolutional ResNet18 model, despite having fewer trainable parameters.
The pre-trained ResNet Mixed Convolution model was the best model in terms of F1-score, obtaining a macro F1-score of 0.9345 and a mean test accuracy of 96.98%, with F1-scores of 0.8949 and 0.9123 for low-grade glioma and high-grade glioma, respectively.
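
The macro F1-score quoted above averages the per-class F1-scores with equal weight, so it can differ noticeably from overall accuracy when the classes are imbalanced. The following is a minimal sketch of how both metrics can be derived from a confusion matrix. The labels are made-up toy data, the class encoding (0 = no tumour, 1 = low-grade glioma, 2 = high-grade glioma) is an assumption for illustration, and the function names are hypothetical; none of this is taken from the paper's code.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_f1(cm):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Toy labels (hypothetical): 0 = no tumour, 1 = low-grade, 2 = high-grade glioma.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 0, 1, 2, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)            # [[4, 0, 0], [0, 1, 1], [0, 1, 3]]
print(macro_f1(cm))  # 0.75
print(accuracy(cm))  # 0.8
```

In this toy case the accuracy (0.8) is higher than the macro F1-score (0.75) because the small, harder class pulls the unweighted average down.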

Summary

Methodology

This section explains the network models used in this research, implementation details, pre-training and training methods, data augmentation techniques, dataset information, data pre-processing steps, and the evaluation metrics. ResNet Mixed Convolution uses a combination of 2D and 3D convolutions. The stem of this model contains a 3D convolution layer with a kernel size of (3,7,7), a stride of (1,2,2), and padding of (1,3,3), where the first dimension is the slice dimension and the other two are the in-plane dimensions; it accepts a single channel as input and provides 64 channels as output. There is one difference between the first convolutional block and the other three blocks (applicable to all three models): the second, third and fourth convolutional blocks include a downsampling pair, which consists of a 3D convolutional layer with a kernel size of one and a stride of two, followed by a batch normalisation layer. A confusion matrix was used to show class-wise accuracy.
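
The effect of the stem and downsampling layers on the feature-map size can be checked with the standard convolution output-size formula, out = floor((in + 2 × padding − kernel) / stride) + 1. The sketch below applies it in plain Python. The input volume size is a hypothetical example rather than a value from the paper, the downsampling convolution is assumed to use no padding, the stem is assumed to have no bias term, and the helper name `conv_out_shape` is illustrative only.

```python
def conv_out_shape(in_shape, kernel, stride, padding):
    """Output size per spatial dimension for a standard (non-dilated) convolution."""
    return tuple(
        (n + 2 * p - k) // s + 1
        for n, k, s, p in zip(in_shape, kernel, stride, padding)
    )


# Hypothetical input volume as (slices, height, width); not a value from the paper.
volume = (32, 224, 224)

# Stem: kernel (3,7,7), stride (1,2,2), padding (1,3,3), as described in the text.
stem_out = conv_out_shape(volume, (3, 7, 7), (1, 2, 2), (1, 3, 3))

# Downsampling convolution applied to a feature map of that size:
# kernel size 1 and stride 2 (from the text); zero padding is an assumption.
down_out = conv_out_shape(stem_out, (1, 1, 1), (2, 2, 2), (0, 0, 0))

# Weight count of the stem: 1 input channel, 64 output channels, no bias assumed.
stem_weights = 64 * 1 * 3 * 7 * 7

print(stem_out)      # (32, 112, 112)
print(down_out)      # (16, 56, 56)
print(stem_weights)  # 9408
```

Under these assumptions, the stem keeps the slice dimension at full resolution while halving the two in-plane dimensions, whereas the kernel-size-one, stride-two convolution halves all three dimensions.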

Results

Discussion

Conclusion