Multimodal Transformer of Incomplete MRI Data for Brain Tumor Segmentation.
Accurate segmentation of brain tumors plays an important role in clinical diagnosis and treatment. Multimodal magnetic resonance imaging (MRI) can provide rich and complementary information for accurate brain tumor segmentation. However, some modalities may be absent in clinical practice, and integrating incomplete multimodal MRI data for accurate segmentation of brain tumors remains challenging. In this paper, we propose a brain tumor segmentation method based on a multimodal transformer network for incomplete multimodal MRI data. The network follows the U-Net architecture and consists of modality-specific encoders, a multimodal transformer and a multimodal shared-weight decoder. First, a convolutional encoder is built to extract the specific features of each modality. Then, a multimodal transformer is proposed to model the correlations of multimodal features and learn the features of missing modalities. Finally, a multimodal shared-weight decoder is proposed to progressively aggregate the multimodal and multi-level features with spatial and channel self-attention modules for brain tumor segmentation. A missing-full complementary learning strategy is used to explore the latent correlation between the missing and full modalities for feature compensation. For evaluation, our method is tested on the multimodal MRI data from the BraTS 2018, BraTS 2019 and BraTS 2020 datasets. Extensive results demonstrate that our method outperforms state-of-the-art methods for brain tumor segmentation on most subsets of missing modalities.
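To make "subsets of missing modalities" concrete, the following minimal sketch enumerates every non-empty combination of the four MRI sequences distributed with BraTS (FLAIR, T1, T1ce, T2), which gives 15 availability settings. The function names `available_subsets` and `availability_mask` are hypothetical helpers introduced for illustration; they are not taken from the paper.

```python
from itertools import combinations

# The four MRI sequences provided for each case in the BraTS datasets.
MODALITIES = ("FLAIR", "T1", "T1ce", "T2")


def available_subsets(modalities=MODALITIES):
    """Return every non-empty combination of available modalities."""
    subsets = []
    for k in range(1, len(modalities) + 1):
        subsets.extend(combinations(modalities, k))
    return subsets


def availability_mask(subset, modalities=MODALITIES):
    """Encode a subset as a 0/1 tuple aligned with `modalities`."""
    return tuple(int(m in subset) for m in modalities)


if __name__ == "__main__":
    # One line per evaluation setting: mask, then the modalities present.
    for subset in available_subsets():
        print(availability_mask(subset), "+".join(subset))
```

A mask of this kind is one simple way to tell a network which inputs are present; how the mask is consumed depends on the model and is not specified here.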
- Research Article
- 10.1109/jbhi.2025.3600652
- Mar 1, 2026
- IEEE journal of biomedical and health informatics
The accurate segmentation of brain tumors plays an important role in clinical diagnosis and treatment. Multimodal magnetic resonance imaging (MRI) can provide rich and complementary information for accurate brain tumor segmentation. However, the common problems of incomplete modalities and small samples in clinical practice seriously affect the performance of multimodal segmentation. In this work, we design a new framework, named M$^{2}$SegMamba, that uses Mamba and Masked Autoencoder networks for both supervised and self-supervised learning and is aimed at small-sample brain tumor segmentation under various incomplete-modality settings. We construct a masking strategy suited to multimodal brain tumor images to precisely extract image features, which serve as the foundation for image segmentation. By fully leveraging the capabilities of the Mamba network, we design a multi-traversal method to facilitate the interaction between inter-modal and cross-modal image features. Meanwhile, introducing TSmamba into the skip connections efficiently integrates multimodal features. Auxiliary regularizers are introduced in both the encoder and decoder to further enhance the model's robustness to incomplete modalities. We conducted experiments on the BraTS 2018 and BraTS 2020 datasets, and the results demonstrate that our method outperforms state-of-the-art brain tumor segmentation methods on most subsets of missing modalities.
- Research Article
2
- 10.56286/ntujet.v2i3.692
- Nov 20, 2023
- NTU Journal of Engineering and Technology
Detecting and quantifying the extent of brain tumors poses a formidable challenge in medical centers. Magnetic Resonance Imaging (MRI) has become the primary non-invasive diagnostic tool for brain cancers, offering the crucial advantage of avoiding ionizing radiation. Manually segmenting brain tumor boundaries within 3D MRI volumes is an exceedingly time-intensive task that relies heavily on operator expertise. Among brain tumors, gliomas stand out as the most prevalent and highly malignant type, significantly impacting patients' life expectancy, particularly at their highest grade. Recognizing the pressing need for a reliable, completely automatic segmentation technique to efficiently assess tumor extent, this study introduces a robust approach. A completely automated brain tumor segmentation method is proposed, leveraging U-Net-based deep convolutional networks. This approach underwent rigorous evaluation on the Multimodal Brain Tumor Image Segmentation BraTS-19 dataset, a widely recognized medical image analysis dataset featuring multimodal MRI scans of brain tumors, including glioblastoma, anaplastic astrocytoma, and lower-grade glioma, coupled with corresponding manual tumor segmentations. This dataset serves as a pivotal resource for advancing automatic brain tumor segmentation techniques and assessing their performance using metrics such as the Dice score, which reached 92% for the whole tumor. Cross-validation results affirm the efficiency and promise of our method in achieving accurate segmentation.
- Research Article
- 10.1002/ima.70056
- Mar 1, 2025
- International Journal of Imaging Systems and Technology
Brain tumors, particularly gliomas, pose a significant global health challenge, causing numerous fatalities annually. Among gliomas, glioblastoma stands out as a highly aggressive type, often resulting in severe symptoms. Accurate segmentation of brain tumors from multimodal magnetic resonance imaging (MRI) data is crucial for effective diagnosis and treatment planning. This study introduces a novel 3D U‐Net semantic segmentation model with a modified data generator approach, specifically tailored for the brain tumor segmentation (BraTS) 2020 dataset. The modified data generator is unique in that it performs on‐the‐fly data augmentation, generating diverse and distinct data samples during training. This approach reduces overfitting and enhances generalization, which is critical for handling the variability of brain tumor presentations. The model was trained end‐to‐end without weight transfer, optimizing the dice score as the primary evaluation metric. The proposed model achieved dice scores of 82.2%, 90.3%, and 77.8% for tumor core, whole tumor, and enhancing tumor regions, respectively, on the BraTS 2020 validation dataset. The minimal variation from training data underscores the model's robustness and reliability in segmenting different tumor subtypes. The modified data generator approach presents a promising advancement for brain tumor segmentation, with the potential for significant improvements in treatment planning and patient outcomes. This model could support more accurate and robust segmentation in clinical applications by effectively addressing data variability.
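The Dice scores reported for tumor core, whole tumor, and enhancing tumor can be illustrated with a minimal NumPy sketch. It assumes the label convention used in BraTS 2017 to 2020 (1 = necrotic/non-enhancing core, 2 = peritumoral edema, 4 = enhancing tumor) and treats two empty masks as perfect agreement; both the helper names and that empty-mask choice are assumptions for illustration, not the evaluation code of the paper.

```python
import numpy as np

# BraTS 2017-2020 label convention (0 is background):
# 1 = necrotic/non-enhancing core, 2 = peritumoral edema, 4 = enhancing tumor.
REGIONS = {
    "whole_tumor": (1, 2, 4),
    "tumor_core": (1, 4),
    "enhancing_tumor": (4,),
}


def dice(pred_mask, true_mask):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two boolean masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    denom = pred_mask.sum() + true_mask.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(pred_mask, true_mask).sum() / denom)


def region_dice(pred_labels, true_labels):
    """Compute Dice per nested tumor region from two label maps."""
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    return {
        name: dice(np.isin(pred_labels, labels), np.isin(true_labels, labels))
        for name, labels in REGIONS.items()
    }
```

Because the three regions are nested, a single missed enhancing-tumor voxel lowers all three scores, but it affects the smallest region most.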
- Research Article
5
- 10.23977/acss.2023.070803
- Sep 1, 2023
- Advances in Computer, Signals and Systems
The application of deep learning in the field of medical imaging has become increasingly widespread, greatly promoting the development of Magnetic Resonance Imaging (MRI) brain tumor detection and segmentation techniques. Therefore, a comprehensive review of deep learning-based methods for MRI brain tumor detection and segmentation was conducted. This review introduces the basic concepts of brain tumors and MRI brain tumor detection and segmentation, discusses the specific applications and typical methods of deep learning in MRI brain tumor detection and segmentation, and analyzes and compares the performance, advantages, and disadvantages of different methods. Additionally, the representative brain tumor segmentation dataset (BraTS) and its evaluation metrics are introduced, upon which the performance of various deep learning-based brain tumor segmentation methods on the BraTS 2019-2022 datasets is compared. Lastly, the challenges and future development trends in deep learning-based MRI brain tumor detection and segmentation methods are summarized and anticipated.
- Book Chapter
176
- 10.1007/978-3-031-16443-9_11
- Jan 1, 2022
Accurate brain tumor segmentation from Magnetic Resonance Imaging (MRI) benefits from joint learning of multimodal images. However, in clinical practice, it is not always possible to acquire a complete set of MRIs, and the problem of missing modalities causes severe performance degradation in existing multimodal segmentation methods. In this work, we present the first attempt to exploit the Transformer for multimodal brain tumor segmentation that is robust to any combinatorial subset of available modalities. Concretely, we propose a novel multimodal Medical Transformer (mmFormer) for incomplete multimodal learning with three main components: the hybrid modality-specific encoders that bridge a convolutional encoder and an intra-modal Transformer for both local and global context modeling within each modality; an inter-modal Transformer to build and align the long-range correlations across modalities for modality-invariant features with global semantics corresponding to tumor region; a decoder that performs a progressive up-sampling and fusion with the modality-invariant features to generate robust segmentation. Besides, auxiliary regularizers are introduced in both encoder and decoder to further enhance the model’s robustness to incomplete modalities. We conduct extensive experiments on the public BraTS 2018 dataset for brain tumor segmentation. The results demonstrate that the proposed mmFormer outperforms the state-of-the-art methods for incomplete multimodal brain tumor segmentation on almost all subsets of incomplete modalities, especially with an average Dice improvement of 19.07% on tumor segmentation with only one available modality. The code is available at https://github.com/YaoZhang93/mmFormer .
- Research Article
2
- 10.1002/mp.17845
- Apr 28, 2025
- Medical physics
The main task of deep learning (DL) based brain tumor segmentation is to obtain an accurate projection from learned image features to their corresponding semantic labels (i.e., brain tumor sub-regions). To achieve this goal, segmentation networks are required to learn image features with high intra-class consistency. However, brain tumors are known to be heterogeneous, which often causes high diversity in image gray values and in turn influences the learned image features. Therefore, projecting such diverse image features (i.e., low intra-class consistency) to the same semantic label is often difficult and inefficient. The purpose of this study is to address the issue of low intra-class consistency of image features learned from heterogeneous brain tumor regions and ease the projection of image features to their corresponding semantic labels. In this way, accurate segmentation of brain tumors can be achieved. We propose a new DL-based method for brain tumor segmentation, where a semantic feature module (SFM) is introduced to consolidate image features with meaningful semantic information and enhance their intra-class consistency. Specifically, in the SFM, deep semantic vectors are derived and used as prototypes to re-encode image features learned in the segmentation network. Because the deep semantic vectors are relatively consistent, the diversity of the resulting image features can be reduced; moreover, semantic information in the resulting image features can also be enriched, both facilitating accurate projection to the final semantic labels. In the experiment, a public brain tumor dataset, BraTS2022, containing multi-sequence MR images of 1251 patients, is used to evaluate our method in the task of brain tumor sub-region segmentation, and the experimental results demonstrate that, benefiting from the SFM, our method outperforms the state-of-the-art methods with statistical significance (using the Wilcoxon signed rank test).
A further ablation study shows that the proposed SFM can yield an improvement in segmentation accuracy (Dice index) of up to 11% compared with the same network without the SFM. In DL-based segmentation, low intra-class consistency of learned image features degrades segmentation performance. The proposed SFM can effectively enhance the intra-class consistency with high-level semantic information, making the projection of image features to their corresponding semantic labels more accurate.
- Research Article
11
- 10.1016/j.compbiomed.2024.108799
- Jun 25, 2024
- Computers in Biology and Medicine
Comprehensive benchmarking of CNN-based tumor segmentation methods using multimodal MRI data
- Research Article
15
- 10.1016/j.compmedimag.2024.102332
- Jan 11, 2024
- Computerized Medical Imaging and Graphics
Accurate brain tumor segmentation is critical for diagnosis and treatment planning, whereby multi-modal magnetic resonance imaging (MRI) is typically used for analysis. However, obtaining all required sequences and expertly labeled data for training is challenging and can result in decreased quality of segmentation models developed through automated algorithms. In this work, we examine the possibility of employing a conditional generative adversarial network (GAN) approach for synthesizing multi-modal images to train deep learning-based neural networks aimed at high-grade glioma (HGG) segmentation. The proposed GAN is conditioned on auxiliary brain tissue and tumor segmentation masks, allowing us to attain better accuracy and control of tissue appearance during synthesis. To reduce the domain shift between synthetic and real MR images, we additionally adapt the low-frequency Fourier space components of synthetic data, reflecting the style of the image, to those of real data. We demonstrate the impact of Fourier domain adaptation (FDA) on the training of 3D segmentation networks and attain significant improvements in both the segmentation performance and prediction confidence. Similar outcomes are seen when such data is used as a training augmentation alongside the available real images. In fact, experiments on the BraTS2020 dataset reveal that models trained solely with synthetic data exhibit an improvement of up to 4% in Dice score when using FDA, while training with both real and FDA-processed synthetic data through augmentation results in an improvement of up to 5% in Dice compared to using real data alone. This study highlights the importance of considering image frequency in generative approaches for medical image synthesis and offers a promising approach to address data scarcity in medical imaging segmentation.
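The idea of adapting low-frequency Fourier components can be sketched as follows. This is a minimal 2D illustration under stated assumptions, not the authors' implementation: it swaps only the amplitude (keeping the synthetic phase), operates on a single slice rather than a 3D volume, and uses a hypothetical `beta` parameter to set the size of the swapped low-frequency block.

```python
import numpy as np


def fourier_domain_adapt(synthetic, real, beta=0.1):
    """Replace the low-frequency amplitude of `synthetic` with that of `real`.

    Both inputs are 2D arrays of the same shape. `beta` (hypothetical
    parameter) is the half-width of the swapped square block, expressed as a
    fraction of the shorter image side.
    """
    synthetic = np.asarray(synthetic, dtype=float)
    real = np.asarray(real, dtype=float)
    if synthetic.shape != real.shape:
        raise ValueError("images must have the same shape")

    # Shift the zero frequency to the centre so that the low frequencies
    # form a contiguous block around index (h // 2, w // 2).
    f_syn = np.fft.fftshift(np.fft.fft2(synthetic))
    f_real = np.fft.fftshift(np.fft.fft2(real))
    amp_syn, phase_syn = np.abs(f_syn), np.angle(f_syn)
    amp_real = np.abs(f_real)

    h, w = synthetic.shape
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    rows = slice(max(ch - b, 0), ch + b + 1)
    cols = slice(max(cw - b, 0), cw + b + 1)
    amp_syn[rows, cols] = amp_real[rows, cols]

    # Recombine the mixed amplitude with the original synthetic phase.
    f_mixed = amp_syn * np.exp(1j * phase_syn)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f_mixed)))
```

With `beta=0` only the zero-frequency term is exchanged, which for positive-valued images simply matches the mean intensity of the real image; larger values transfer progressively more of the global intensity "style" while the synthetic phase keeps the anatomy in place.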
- Research Article
25
- 10.1016/j.knosys.2024.111854
- May 9, 2024
- Knowledge-Based Systems
Multi-teacher cross-modal distillation with cooperative deep supervision fusion learning for unimodal segmentation
- Research Article
58
- 10.1016/j.metrad.2023.100004
- Jun 1, 2023
- Meta-Radiology
Vision transformers in multi-modal brain tumor MRI segmentation: A review
- Research Article
79
- 10.3390/sym12081256
- Jul 29, 2020
- Symmetry
Accurate brain tumor segmentation from 3D Magnetic Resonance Imaging (3D-MRI) is an important method for obtaining information required for diagnosis and disease therapy planning. Variation in the brain tumor’s size, structure, and form is one of the main challenges in tumor segmentation, and selecting the initial contour plays a significant role in reducing the segmentation error and the number of iterations in the level set method. To overcome this issue, this paper suggests a two-step dragonfly algorithm (DA) clustering technique to extract initial contour points accurately. The brain is extracted from the head in the preprocessing step, then tumor edges are extracted using the two-step DA, and these extracted edges are used as an initial contour for the MRI sequence. Lastly, the tumor region is extracted from all volume slices using a level set segmentation method. The results of applying the proposed technique on 3D-MRI images from the multimodal brain tumor segmentation challenge (BRATS) 2017 dataset show that the proposed method for brain tumor segmentation is comparable to the state-of-the-art methods.
- Research Article
5
- 10.1109/access.2025.3571464
- Jan 1, 2025
- IEEE Access
Accurate brain tumor segmentation in MRI scans is crucial for effective diagnosis and treatment planning, as even minor segmentation errors can lead to significant clinical concerns. However, this task is challenged by the complex anatomy of the brain, variable tumor shapes, and low contrast between tumor sub-regions. Moreover, existing methods often lack generalizability across diverse segmentation tasks and imaging modalities. Large Vision Models (LVMs) such as the Segment Anything Model (SAM) represent a foundational advancement in tumor segmentation. While SAM has shown impressive performance on natural images, its effectiveness in medical imaging, especially brain tumor segmentation, is limited due to domain differences and indistinct boundaries in MRI scans. To address these challenges, we propose Eff-SAM, a 3D brain tumor segmentation framework that adapts SAM for medical applications through a Parameter-Efficient Fine-Tuning (PEFT) technique. The framework incorporates PEFT with adapters into SAM’s encoder to fine-tune the model effectively and optimize its performance for medical imaging while maintaining computational efficiency. Additionally, a Cross-Sliced Attention (CSA) mechanism is introduced into the encoder to capture semantic relationships and improve tumor localization. Robust data preprocessing further enhances the model’s generalization across datasets. Eff-SAM demonstrates state-of-the-art performance and outperforms benchmark methods on the BraTS 2020 and BraTS 2021 datasets. It achieves Dice scores of 0.884 for Whole Tumor (WT), 0.853 for Tumor Core (TC), and 0.818 for Enhancing Tumor (ET) on the BraTS 2020 dataset, and 0.880 (WT), 0.861 (TC), and 0.821 (ET) on the BraTS 2021 dataset.
This work highlights the potential of integrating vision models like SAM with lightweight, domain-specific modules to deliver accurate and efficient brain tumor segmentation, offering a clinically valuable tool, especially in scenarios with limited data and complex tumor morphology.
- Conference Article
16
- 10.1109/ner49283.2021.9441286
- May 4, 2021
Non-invasive techniques such as magnetic resonance imaging (MRI) are widely employed in brain tumor diagnostics. However, manual segmentation of brain tumors from 3D MRI volumes is a time-consuming task that requires trained expert radiologists. Due to the subjectivity of manual segmentation, inter-rater reliability is low, which can result in diagnostic discrepancies. As the success of many brain tumor treatments depends on early intervention, early detection is paramount. In this context, a fully automated brain tumor segmentation method is needed as an efficient and reliable means of brain tumor detection and quantification. In this study, we propose an end-to-end approach for brain tumor segmentation, capitalizing on a modified version of QuickNAT, a deep convolutional neural network (CNN) for brain tissue type segmentation. Our method was evaluated on a data set of T1-weighted images from 233 patients, annotated with three tumor classes (meningioma, glioma, and pituitary). Our model, QuickTumorNet, demonstrated fast, reliable, and accurate brain tumor segmentation that can be utilized to assist clinicians in diagnosis and treatment.
- Research Article
35
- 10.1109/tmi.2023.3301934
- Dec 1, 2023
- IEEE transactions on medical imaging
Accurate segmentation of brain tumors is of critical importance in clinical assessment and treatment planning, which requires multiple MR modalities providing complementary information. However, due to practical limits, one or more modalities may be missing in real scenarios. To tackle this problem, existing methods need to train multiple networks or a unified but fixed network for various possible missing modality cases, which leads to high computational burdens or sub-optimal performance. In this paper, we propose a unified and adaptive multi-modal MR image synthesis method, and further apply it to tumor segmentation with missing modalities. Based on the decomposition of multi-modal MR images into common and modality-specific features, we design a shared hyper-encoder for embedding each available modality into the feature space, a graph-attention-based fusion block to aggregate the features of available modalities to the fused features, and a shared hyper-decoder for image reconstruction. We also propose an adversarial common feature constraint to enforce the fused features to be in a common space. As for missing modality segmentation, we first conduct the feature-level and image-level completion using our synthesis method and then segment the tumors based on the completed MR images together with the extracted common features. Moreover, we design a hypernet-based modulation module to adaptively utilize the real and synthetic modalities. Experimental results suggest that our method can not only synthesize reasonable multi-modal MR images, but also achieve state-of-the-art performance on brain tumor segmentation with missing modalities.
- Research Article
43
- 10.1016/j.cmpb.2021.106154
- May 13, 2021
- Computer Methods and Programs in Biomedicine
CLCU-Net: Cross-level connected U-shaped network with selective feature aggregation attention module for brain tumor segmentation