Abstract

Semantic segmentation of medical images with deep learning models is developing rapidly. In this study, we benchmarked state-of-the-art deep learning segmentation algorithms on our clinical stereotactic radiosurgery dataset. The dataset consists of 1688 patients with various brain lesions (pituitary tumors, meningioma, schwannoma, brain metastases, arteriovenous malformation, and trigeminal neuralgia), and we divided it into a training set (1557 patients) and a test set (131 patients). This study demonstrates the strengths and weaknesses of deep-learning algorithms in a fairly practical scenario. We compared model performance with respect to sampling method, model architecture, and choice of loss function, identifying suitable settings for each application and shedding light on possible improvements. Evidence from this study led us to conclude that deep learning could be promising in assisting the segmentation of brain lesions even when the training dataset is highly heterogeneous in lesion types and sizes.

Highlights

  • Stereotactic radiosurgery (SRS) is a treatment modality using ionizing radiation, focusing on precisely selected areas of tissue

  • In preliminary tests, center patch sizes smaller than 190 × 190 resulted in very low segmentation performance for V-Net on the NTUH dataset, so this sampling method was not used in the formal benchmark analysis

  • To compare the variables contributing to the performances of the models trained with the NTUH dataset, we experimented with the segmentation of the brain tumors in the BraTS dataset
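
The center patch sampling mentioned in the highlights can be illustrated with a minimal sketch. It assumes that a "center patch" is a fixed-size crop taken around the geometric center of a slice; the paper's actual sampling procedure may differ. The function names `center_patch_bounds` and `crop_center_2d` and the 256 × 256 slice size in the usage note are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def center_patch_bounds(
    image_shape: Sequence[int], patch_shape: Sequence[int]
) -> List[Tuple[int, int]]:
    """Return one (start, stop) index pair per axis for a centered patch.

    Assumption: the patch is centered on the image, not on the lesion.
    """
    if len(image_shape) != len(patch_shape):
        raise ValueError("image_shape and patch_shape must have the same rank")
    bounds = []
    for size, patch in zip(image_shape, patch_shape):
        if patch > size:
            raise ValueError(f"patch size {patch} exceeds image size {size}")
        # Integer division puts any odd leftover pixel on the far side.
        start = (size - patch) // 2
        bounds.append((start, start + patch))
    return bounds


def crop_center_2d(image: List[List[float]], patch_shape: Sequence[int]):
    """Crop a centered patch from a 2D image stored as a list of rows."""
    shape = (len(image), len(image[0]))
    (r0, r1), (c0, c1) = center_patch_bounds(shape, patch_shape)
    return [row[c0:c1] for row in image[r0:r1]]
```

For a hypothetical 256 × 256 slice, `center_patch_bounds((256, 256), (190, 190))` gives the index range of a 190 × 190 patch on each axis. The same bounds can be used to slice an array in a numerical library if one is available.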

Summary

Introduction

Stereotactic radiosurgery (SRS) is a treatment modality that focuses ionizing radiation on precisely selected areas of tissue. Havaei et al. (2017) proposed using a deep learning model to perform brain tumor segmentation on MRI images [12]. They pointed out that both local and global representations are essential to produce better results, and this intuition was later realized in various ways. [Sci. 2021, 11, 9180] In the previously mentioned studies, small sample sizes were important contributors to the lack of confidence in inferring how well deep-learning models generalize to clinical practice with heterogeneous lesion types. We used the BraTS dataset to evaluate whether our implementations of deep learning models were correct and comparable to their original implementations.

Materials and Methods
Preprocessing
Data Augmentation
Deep Learning Models
Uniform Patch
Center Patch
Hyperparameters
Loss Functions
Weighted Cross-Entropy
Soft-Dice
Precision and Sensitivity
Performances of Models on Segmentation of Brain Lesions in NTUH Dataset
Performances of Models on Segmentation of Brain Tumors in BraTS Dataset
Segmentation Performance
Performance on Different Types of Tumor
Comparison between Deep Learning Models
Limitation of This Study
Conclusions
