Medical image segmentation using deep semantic-based methods: A review of techniques, applications and emerging trends
- Research Article
161
- 10.1007/s10278-021-00556-w
- Jan 12, 2022
- Journal of digital imaging
In recent years, generative adversarial networks (GANs) have gained tremendous popularity for various imaging-related tasks, such as artificial image generation to support AI training. GANs are especially useful for medical imaging tasks, where training datasets are usually limited in size and heavily imbalanced against the diseased class. We present a systematic review, following the PRISMA guidelines, of recent GAN architectures used for medical image analysis to help readers make an informed decision before employing GANs to develop medical image classification and segmentation models. We extracted 54 papers, published between January 2015 and August 2020 and meeting the inclusion criteria for meta-analysis, that highlight the capabilities and applications of GANs in medical imaging. Our results show four main GAN architectures that are used for segmentation or classification in medical imaging. We provide a comprehensive overview of recent trends in the application of GANs in clinical diagnosis through medical image segmentation and classification, and share experiences for task-based GAN implementations.
- Research Article
7
- 10.3389/fmed.2024.1394262
- Jun 25, 2024
- Frontiers in medicine
Rectal cancer (RC) is a globally prevalent malignant tumor that presents significant challenges in management and treatment. Magnetic resonance imaging (MRI) offers superior soft-tissue contrast and is radiation-free for RC patients, making it the most widely used and effective detection method. In early screening, radiologists rely on the patient's radiological characteristics and their own clinical experience for diagnosis. However, diagnostic accuracy may be hindered by factors such as limited expertise, visual fatigue, and poor image clarity, resulting in misdiagnosis or missed diagnosis. Moreover, the rectum is surrounded by many organs, some of which have shapes similar to the tumor but unclear boundaries; these complexities greatly impede doctors' ability to diagnose RC accurately. With recent advancements in artificial intelligence, machine learning techniques such as deep learning (DL) have demonstrated immense potential in medical image analysis. DL has significantly advanced research in medical image classification, detection, and segmentation, with particular emphasis on segmentation. This review discusses the development of DL segmentation algorithms and their application to lesion segmentation in MRI images of RC, to provide theoretical guidance and support for further advancements in this field.
- Research Article
2
- 10.35629/5252-0612125135
- Dec 1, 2024
- International Journal of Advances in Engineering and Management
The rapid advancements in medical imaging technologies have significantly enhanced diagnostic accuracy and clinical decision-making in modern healthcare. Image segmentation and deep learning have emerged as transformative tools among these advancements. This article explores the pivotal role of image segmentation and deep learning in medical imaging, detailing their methodologies, applications, challenges, and future directions. Deep learning, particularly Convolutional Neural Networks (CNNs), has revolutionized medical imaging by automating the analysis of complex datasets and improving diagnostic precision. Image segmentation, a fundamental component of medical imaging, allows for delineating specific structures such as organs, tissues, and pathological regions. Together, these technologies have been applied in diverse fields, including oncology, cardiology, neurology, and ophthalmology, enabling applications such as tumor detection, organ segmentation, disease progression monitoring, and treatment planning. However, despite its transformative potential, the integration of deep learning into medical imaging faces several challenges. These include data scarcity, privacy concerns, interpretability issues, and regulatory hurdles. The article discusses various strategies to address these challenges, such as data augmentation, transfer learning, and the development of explainable AI models to ensure transparency and trustworthiness. Evaluation metrics, such as accuracy, sensitivity, specificity, and Dice Similarity Coefficient (DSC), are essential for assessing model performance. Rigorous clinical validation and regulatory approval are crucial to integrating deep learning systems into clinical workflows effectively. Looking ahead, the future of deep learning in medical imaging holds immense promise. 
Innovations like multimodal imaging, personalized medicine, and AI-driven automation are set to further revolutionize the field, enhancing the efficiency and accuracy of diagnostics. Collaborative efforts between clinicians, researchers, and AI developers will play a vital role in overcoming current limitations and driving progress. This article concludes by emphasizing the transformative potential of deep learning and image segmentation in medical imaging, highlighting their ability to improve diagnostic accuracy, streamline clinical workflows, and ultimately, enhance patient care. By addressing current challenges and continuing to innovate, these technologies are poised to redefine the landscape of medical diagnostics and treatment in the years to come.
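The evaluation metrics named in this abstract (accuracy, sensitivity, specificity, and the Dice Similarity Coefficient) all follow from the four confusion counts of a binary mask. The following is a minimal sketch in plain Python, assuming each mask is flattened into an equal-length list of 0/1 labels; the function names and the 0/0 convention are illustrative choices and do not come from the article.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two equal-length sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def _ratio(num, den):
    # Convention chosen for this sketch: an undefined ratio (0/0) is reported as 1.0.
    return num / den if den else 1.0


def segmentation_metrics(pred, truth):
    """Compute accuracy, sensitivity, specificity and Dice from binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": _ratio(tp, tp + fn),       # true positive rate
        "specificity": _ratio(tn, tn + fp),       # true negative rate
        "dice": _ratio(2 * tp, 2 * tp + fp + fn), # overlap of prediction and truth
    }


# Toy example: six pixels, predicted mask versus reference mask.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(segmentation_metrics(pred, truth))
```

In segmentation, background pixels usually dominate, so accuracy and specificity can look high even for a poor mask; Dice ignores true negatives, which is one reason it is commonly reported alongside them.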
- Research Article
5
- 10.1016/j.vrih.2024.04.001
- Jun 1, 2024
- Virtual Reality & Intelligent Hardware
A review of medical ocular image segmentation
- Research Article
1
- 10.11834/jig.211019
- Jan 1, 2023
- Journal of Image and Graphics
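Objective: Three-dimensional (3D) image segmentation of human tissues, organs, and lesion regions is an important prerequisite for computer-aided medical diagnosis and an important technical basis for 3D visualization of medical images. The success of deep learning methods in medical image segmentation usually depends on large amounts of annotated data. Semi-supervised learning exploits the easy availability of unlabeled data by training with a small amount of labeled data and a large amount of unlabeled data; this alleviates the cost and time of data annotation and has attracted wide attention in medical image segmentation. To make better use of unlabeled data and improve segmentation results, a new consistency regularization method is proposed for semi-supervised 3D medical image segmentation. Method: The model uses V-Net as its base architecture. By extending the network structure, a cross loss for regularization is added between the dual-task main and auxiliary decoders, both of which carry a segmentation task and a regression task, to build SACC-Net (shape-aware cross-consistency regular network based on dual tasks). This realizes a consistency regularization method that fuses data-level and model-level perturbations into a multi-task mechanism, so that the model makes better use of the effective prior information in unlabeled data and generalizes better. Results: The algorithm is validated on 3D left atrium MRI from the dataset released by the MICCAI 2018 (Medical Image Computing and Computer Assisted Intervention Society) atrial segmentation challenge. In the experimental group using only 10% of the labeled training data, the Dice coefficient, Jaccard index, Hausdorff distance (HD), and average symmetric surface distance are 88.01%, 78.89%, 8.19, and 2.09, respectively. In the other group, using only 20% of the labeled data, they reach 90.11%, 82.11%, 6.57, and 1.78, respectively. Conclusion: The proposed semi-supervised segmentation model performs notably well; it combines consistency regularization constraints at the data, model, and task levels, and achieves better segmentation and generalization than other semi-supervised methods.

Objective: Three-dimensional (3D) image segmentation of human tissues, organs, and lesion areas underpins computer-aided diagnosis and 3D visualization of medical images. With the rise of deep learning, fully supervised network models have developed rapidly for medical image segmentation tasks. However, they require a large amount of annotated data, and labeling 3D segmentation data is costly and inefficient. Semi-supervised learning uses a small amount of labeled data together with plentiful, easily acquired unlabeled data, which alleviates the cost and time of data labeling. Our research focuses on a new consistency regularization method for semi-supervised 3D medical image segmentation. To improve segmentation results, our model exploits unlabeled data by fusing different consistency methods. Method: The network model is based on V-Net, from which the residual structure of the encoder and decoder is removed.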
To extract effective features from unlabeled data, the proposed shape-aware cross-consistency regular network based on dual tasks (SACC-Net) extends the V-Net structure into one shared encoder and two independent dual-task decoders: a main decoder A and an auxiliary decoder B. The output of the shared encoder is passed to the two decoders after noise perturbation, and both decoders output predictions at each iteration. To increase the generalization and noise robustness of the model, training minimizes the difference between the two sets of results. Additionally, the proportion of labeled samples in the training set is extremely small, and the feature distributions of the pre-processed medical image samples are relatively similar. To further improve the model's ability to learn from the segmented samples, geometric prior constraints are incorporated into the segmentation target: a shape-aware regression layer is added at the end of each decoder. During training, each decoder therefore outputs two predictions at the same time, so the total output of the network at each iteration consists of four parts: the segmentation map SA and the signed distance map output by decoder A, and the segmentation map SB and the signed distance map output by decoder B. Dual-task consistency is applied within each decoder. To help the model learn the effective features of segmentation targets to a greater extent, these constraints and the cross constraints between decoders realize a consistency regularization method that combines data-level and model-level perturbations with a multi-task mechanism and makes better use of unlabeled data.
Result: Our algorithm is validated on the MRI dataset published in the Atrium Segmentation Challenge held by MICCAI (Medical Image Computing and Computer Assisted Intervention Society) in 2018. The experiments are divided into two groups according to the proportion of labeled data. In the group using only 10% annotated training data, the Dice coefficient, Jaccard index, Hausdorff distance (HD), and average symmetric surface distance reach 88.01%, 78.89%, 8.19, and 2.09, respectively. In the other group, using only 20% annotated data, the median Dice coefficient, Jaccard index, HD, and average symmetric surface distance reach 90.14%, 82.11%, 6.57, and 1.78, respectively. Furthermore, compared with the shape-aware semi-supervised net (SASS Net), which uses a level set function for its shape-aware regression task, the Dice index improves by 0.69 and 0.60 with 10% and 20% labeled training data, respectively. Compared with dual-task consistency (DTC) trained with 10% and 20% labeled data, the improvements are 1.44% and 0.72%, respectively. Conclusion: The semi-supervised segmentation model (SACC-Net) optimizes both region-based and boundary-based segmentation criteria and incorporates consistency at the data, model, and task levels. The constrained method shows promising segmentation and generalization performance among semi-supervised methods.
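The abstract describes four outputs per iteration (a segmentation map and a signed distance map from each of two decoders) tied together by within-decoder and cross-decoder consistency. The sketch below illustrates that idea on flattened toy voxels in plain Python. Several details are assumptions made only for illustration and are not taken from the paper: signed distances are negative inside the object, they are mapped to a soft mask with a steep sigmoid of slope `k`, the discrepancy is a mean squared error, and all terms are weighted equally.

```python
import math


def sdm_to_soft_mask(sdm, k=10.0):
    """Map signed distances to (0, 1); negative (inside) values map above 0.5.

    Assumed convention for this sketch. Keep k * |z| moderate to avoid overflow.
    """
    return [1.0 / (1.0 + math.exp(k * z)) for z in sdm]


def mse(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_consistency(seg_a, sdm_a, seg_b, sdm_b, k=10.0):
    """Sum of within-decoder and cross-decoder consistency terms (equal weights)."""
    mask_a = sdm_to_soft_mask(sdm_a, k)
    mask_b = sdm_to_soft_mask(sdm_b, k)
    within = mse(seg_a, mask_a) + mse(seg_b, mask_b)  # each decoder agrees with itself
    cross = mse(seg_a, mask_b) + mse(seg_b, mask_a)   # the decoders agree with each other
    return within + cross


# Toy example: four voxels, the first two inside the object.
seg_a = [0.9, 0.8, 0.1, 0.2]
sdm_a = [-0.5, -0.3, 0.4, 0.6]
seg_b = [0.85, 0.9, 0.15, 0.1]
sdm_b = [-0.4, -0.5, 0.5, 0.3]
print(cross_consistency(seg_a, sdm_a, seg_b, sdm_b))
```

Because every term compares predictions with other predictions rather than with labels, a loss of this kind can be evaluated on unlabeled volumes, which is what makes it usable as a semi-supervised regularizer.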
- Research Article
31
- 10.1016/j.compbiomed.2023.107744
- Nov 23, 2023
- Computers in Biology and Medicine
Cross-domain attention-guided generative data augmentation for medical image analysis with limited data
- Research Article
6
- 10.21928/uhdjst.v4n2y2020.pp75-90
- Aug 27, 2020
- UHD Journal of Science and Technology
Medical image analysis now plays a significant part in the diagnostic process. In general, it involves five processes: medical image classification, detection, segmentation, registration, and localization. Medical imaging is used in diagnosis for most organs and conditions of the human body, such as brain tumors, the chest, the breast, the colon, and the retina, using various modalities. These modalities include magnetic resonance imaging, single photon emission computed tomography, positron emission tomography, optical coherence tomography, confocal laser endoscopy, magnetic resonance spectroscopy, computed tomography (CT), X-ray, wireless capsule endoscopy, breast cancer imaging, Papanicolaou smear, hyperspectral imaging, and ultrasound, which are used to diagnose different organs and conditions. Medical image analysis is a suitable setting for automated intelligent systems. Among intelligent systems, deep learning (DL) is the modern approach for carrying out medical image analysis, breaking an image down into fundamental components to extract meaningful information. The deep convolutional neural network is the best model on which to build such systems. This study reviews some of this work for the following reasons: improvements in medical imaging increase the demand for automated DL-based medical image analysis systems, and in most tested cases the accuracy of intelligent methods, especially DL methods, is higher than that of hand-crafted approaches. Furthermore, manual work takes far more time than automated diagnosis.
- Research Article
247
- 10.53941/ijndi0201006
- Mar 27, 2023
- International Journal of Network Dynamics and Intelligence
Survey/review study. Deep Learning Attention Mechanism in Medical Image Analysis: Basics and Beyonds
Xiang Li 1, Minglei Li 1, Pengfei Yan 1, Guanyi Li 1, Yuchen Jiang 1, Hao Luo 1,*, and Shen Yin 2
1 Department of Control Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
2 Department of Mechanical and Industrial Engineering, Faculty of Engineering, Norwegian University of Science and Technology, Trondheim 7034, Norway
* Correspondence: hao.luo@hit.edu.cn
Received: 16 October 2022; Accepted: 25 November 2022; Published: 27 March 2023
Abstract: With the improvement of hardware computing power and the development of deep learning algorithms, a revolution of "artificial intelligence (AI) + medical image" is taking place. Benefiting from diversified modern medical measurement equipment, a large number of medical images will be produced in the clinical process. These images improve the diagnostic accuracy of doctors, but also increase the labor burden of doctors. Deep learning technology is expected to realize an auxiliary diagnosis and improve diagnostic efficiency. At present, the method of deep learning technology combined with attention mechanism is a research hotspot and has achieved state-of-the-art results in many medical image tasks. This paper reviews the deep learning attention methods in medical image analysis. A comprehensive literature survey is first conducted to analyze the keywords and literature. Then, we introduce the development and technical characteristics of the attention mechanism. For its application in medical image analysis, we summarize the related methods in medical image classification, segmentation, detection, and enhancement. The remaining challenges, potential solutions, and future research directions are also discussed.
- Research Article
2
- 10.1142/s0218001423570136
- Oct 1, 2023
- International Journal of Pattern Recognition and Artificial Intelligence
With the advent of deep neural networks, medical image analysis can support the early detection and diagnosis of diseases in the human body. Several deep neural network methodologies have been implemented for quick and efficient analysis of medical images to detect and diagnose cancerous cell growth in any part of the body. To improve segmentation and classification accuracy, this paper proposes a framework comprising a modified DoubleU-Net for image segmentation and the PolyNet architecture for image classification. The modified DoubleU-Net is composed of two U-Net architectures: U-Net1 uses ResNet-50 as its encoder in place of the existing VGG-16, and Atrous Spatial Pyramid Pooling (ASPP) is replaced by the Waterfall Atrous Spatial Pooling (WASP) architecture in both U-Nets to improve semantic image segmentation. PolyNet is used to classify the segmented medical images as benign or malignant. Experiments on a brain tumor dataset and a lung cancer dataset analyze the performance of the proposed approach. The DoubleU-Net and the modified DoubleU-Net are evaluated using precision, recall, Intersection over Union (IoU), and Dice score. Experimental findings indicate that the modified DoubleU-Net outperforms the existing DoubleU-Net on these segmentation metrics. The PolyNet classifier is evaluated against VGG-16 and Inception-V3 classifiers in terms of accuracy, specificity, sensitivity, error rate, and computation time. The experimental results show that the PolyNet classifier performs better than VGG-16 and Inception-V3, with improved accuracy, specificity, sensitivity, and computation time.
- Research Article
73
- 10.1007/s42979-022-01166-1
- May 17, 2022
- SN Computer Science
Medical image interpretation is an essential task for the correct diagnosis of many diseases. Pathologists, radiologists, physicians, and researchers rely heavily on medical images to perform diagnoses and develop new treatments. However, manual medical image analysis is tedious and time consuming, making it necessary to identify accurate automated methods. Deep learning—especially supervised deep learning—shows impressive performance in the classification, detection, and segmentation of medical images and has proven comparable in ability to humans. This survey aims to help researchers and practitioners of medical image analysis understand the key concepts and algorithms of supervised learning techniques. Specifically, this survey explains the performance metrics of supervised learning methods; summarizes the available medical datasets; studies the state-of-the-art supervised learning architectures for medical imaging processing, including convolutional neural networks (CNNs) and their corresponding algorithms, region-based CNNs and their variants, fully convolutional networks (FCN) and U-Net architecture; and discusses the trends and challenges in the application of supervised learning methods to medical image analysis. Supervised learning requires large labeled datasets to learn and achieve good performance, and data augmentation, transfer learning, and dropout techniques have widely been employed in medical image processing to overcome the lack of such datasets.
- Research Article
- 10.1371/journal.pone.0340108
- Jan 5, 2026
- PLOS One
Transformer-based deep learning architectures have achieved notable success across various medical image analysis tasks, driven by the global modeling capabilities of the self-attention mechanism. However, Transformer-based methods exhibit significant computational complexity and a large number of parameters, rendering them challenging to apply effectively in practical medical scenarios. Compared with Transformers, large-kernel Convolutional Neural Networks (CNNs) and Multi-Layer Perceptrons (MLPs) offer more efficient inference while retaining global contextual awareness. Therefore, we rethink the role of large-kernel CNNs and MLPs in medical image analysis and leverage them to replace the heavy self-attention operation, to strike a better balance between performance and efficiency. Specifically, we propose backbone models for medical image classification and segmentation, featuring three lightweight modules: Linear Attention Feed Forward Network (FFN) for enhancing lesion features, Spatial Encoding Module for integrating multi-scale lesion information, and Smooth Depth-Wise Convolution (DwConv) FFN for efficient interaction of channel features. Composed solely of lightweight convolutional and MLP operations, our method achieves a better balance between performance and efficiency, validated by superior performance on five datasets with varying data scales and diseases, with 98.39% on SARS-COV2-CT-Scan, 98.12% on Monkeypox Skin Lesion Dataset, 98.58% on Large COVID-19-CT scan slice, 79.45% on Synapse and 91.28% on ACDC. The low computational cost, high performance with limited training data, and generalizability to various medical tasks make the proposed method a promising and practical solution for medical image classification and segmentation.
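One reason depth-wise convolutions keep large kernels affordable is their parameter count. The arithmetic below is a generic illustration, not the module proposed in the abstract: it compares a standard k×k convolution with a depth-wise k×k convolution followed by a 1×1 point-wise convolution, ignoring bias terms. The kernel size and channel widths are arbitrary example values.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise kernel x kernel conv plus a 1x1 point-wise conv (no bias)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 conv that mixes channels
    return depthwise + pointwise


# Example values chosen for illustration: a 7x7 kernel with 64 input and output channels.
kernel, c_in, c_out = 7, 64, 64
full = standard_conv_params(kernel, c_in, c_out)
separable = depthwise_separable_params(kernel, c_in, c_out)
print(full, separable, round(full / separable, 1))
```

For these example values the standard convolution has 200,704 weights and the separable pair has 7,232. The gap widens as the kernel grows, since the k² factor multiplies only the input channel count in the depth-wise case.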
- Research Article
56
- 10.1016/j.eswa.2023.119939
- Mar 22, 2023
- Expert Systems with Applications
DSEU-net: A novel deep supervision SEU-net for medical ultrasound image segmentation
- Research Article
23
- 10.1155/2020/1645479
- May 15, 2020
- Complexity
Medical image segmentation is a key technology for image guidance. Therefore, the quality of image segmentation plays an important role in image-guided surgery. Traditional machine learning methods have achieved certain beneficial effects in medical image segmentation, but they have problems such as low classification accuracy and poor robustness. Deep learning theory has good generalizability and feature extraction ability, which provides a new idea for solving medical image segmentation problems. However, deep learning has problems in terms of its application to medical image segmentation: one is that the deep learning network structure cannot be constructed according to medical image characteristics; the other is that the generalizability of the deep learning model is weak. To address these issues, this paper first adapts a neural network to medical image features by adding cross-layer connections to a traditional convolutional neural network. In addition, an optimized convolutional neural network model is established. The optimized convolutional neural network model can segment medical images using the features of two scales simultaneously. At the same time, to solve the generalizability problem of the deep learning model, an adaptive distribution function is designed according to the position of the hidden layer, and then the activation probability of each layer of neurons is set. This enhances the generalizability of the dropout model, and an adaptive dropout model is proposed. This model better addresses the problem of the weak generalizability of deep learning models. Based on the above ideas, this paper proposes a medical image segmentation algorithm based on an optimized convolutional neural network with adaptive dropout depth calculation. An ultrasonic tomographic image and a lumbar CT medical image were separately segmented by the method of this paper.
The experimental results show that not only are the segmentation effects of the proposed method improved compared with those of the traditional machine learning and other deep learning methods but also the method has a high adaptive segmentation ability for various medical images. The research work in this paper provides a new perspective for research on medical image segmentation.
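The abstract says the activation probability of each layer is set from the layer's position, but it does not give the distribution function. The sketch below therefore uses a hypothetical linear schedule (`keep_probability`, with assumed endpoints 0.9 and 0.5) purely to show the mechanism: the keep probability is looked up per layer, then applied with standard inverted dropout. None of the names or values come from the paper.

```python
import random


def keep_probability(layer_index, num_layers, p_first=0.9, p_last=0.5):
    """Hypothetical schedule: interpolate the keep probability linearly with depth."""
    if not 0 <= layer_index < num_layers:
        raise ValueError("layer_index must be in [0, num_layers)")
    if num_layers == 1:
        return p_first
    frac = layer_index / (num_layers - 1)
    return p_first + frac * (p_last - p_first)


def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob, rescale the rest."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)  # seeded so the sketch is repeatable
num_layers = 3
activations = [1.0, 2.0, 3.0, 4.0]
for layer in range(num_layers):
    p = keep_probability(layer, num_layers)
    print(layer, round(p, 2), dropout(activations, p, rng))
```

Dividing the surviving activations by the keep probability keeps their expected value unchanged, so no rescaling is needed at inference time, when dropout is switched off.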
- Research Article
22
- 10.1186/s12880-024-01401-6
- Sep 16, 2024
- BMC Medical Imaging
The recently emerged SAM-Med2D represents a state-of-the-art advancement in medical image segmentation. By fine-tuning the Large Visual Model, Segment Anything Model (SAM), on extensive medical datasets, it has achieved impressive results in cross-modal medical image segmentation. However, its reliance on interactive prompts may restrict its applicability under specific conditions. To address this limitation, we introduce SAM-AutoMed, which achieves automatic segmentation of medical images by replacing the original prompt encoder with an improved MobileNet v3 backbone. Its performance on multiple datasets surpasses both SAM and SAM-Med2D. Current enhancements of the Large Visual Model SAM lack applications in medical image classification. Therefore, we introduce SAM-MedCls, which combines the encoder of SAM-Med2D with our designed attention modules to construct an end-to-end medical image classification model. It performs well on datasets of various modalities, even achieving state-of-the-art results, indicating its potential to become a universal model for medical image classification.
- Research Article
61
- 10.1109/access.2020.2987932
- Jan 1, 2020
- IEEE Access
The classification and segmentation of pathologies through intelligent systems is a significant challenge for medical image analysis and computer vision systems. Diseases such as lung problems and strokes have a serious effect on human health worldwide. Lung diseases are among the leading causes of death worldwide, behind strokes, which in 2016 became the second leading cause of death from illness. Computed tomography (CT) is one of the main clinical diagnostic exams and is linked to Computerized Diagnostic Assistance Systems (CAD), which are becoming solutions for health technologies. In this work, we propose a method based on the Health of Things for the classification and segmentation of CT images of the lung and of hemorrhagic stroke. The system, called HTSCS - Medical Images: Health-of-Things System for the Classification and Segmentation of Medical Images, uses transfer learning between models based on deep learning combined with classical methods for fine-tuning. The proposed method obtained excellent results for the classification of hemorrhagic stroke and pulmonary regions, with values of up to 100% accuracy. The models also achieved outstanding performance for segmentation, with accuracy above 99% and Dice coefficient above 97% in the best cases, and an average segmentation time between 0.095 and 1.7 seconds. To validate our approach, we compared our best models for the segmentation of lung and hemorrhagic stroke in CT scans with related state-of-the-art works. Our method brings an innovative approach to classification and segmentation through the use of the Health of Things for different types of medical images, with promising results for the medical image analysis and computer vision fields.