Advancing Colorectal Polyp Segmentation With Watershed Algorithm-Enhanced Parallel Self-Supervised Learning
Colorectal cancer (CRC) is a major health concern globally, known for its high prevalence and mortality rate. It typically arises from precancerous growths called polyps within the colon or rectum. Early detection and removal of these polyps are crucial for preventing CRC and improving patient outcomes. Traditional deep learning models require large amounts of labeled data, making them costly and time-consuming to develop. To address this, we present an innovative self-supervised learning (SSL) approach for colorectal polyp segmentation, which integrates watershed algorithm-enhanced pseudo-label generation with a novel parallel pretraining method. This approach leverages dual-task pre-training on image reconstruction and pseudo-segmentation to extract more relevant feature representations. Validated on the Kvasir-SEG dataset using three different models, our methodology demonstrates significant improvements in segmentation accuracy and efficiency, particularly in low-data scenarios, suggesting broader applicability in medical image analysis.
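The dual-task pre-training described above can be illustrated with a combined objective. The following is only a minimal sketch under stated assumptions: the abstract does not give the loss functions or their weighting, so mean squared error for reconstruction, binary cross-entropy for pseudo-segmentation, and the `seg_weight` parameter are all hypothetical choices made for illustration.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(prob, label, eps=1e-7):
    """Binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(prob)

def dual_task_loss(recon, image, seg_prob, pseudo_label, seg_weight=1.0):
    """Hypothetical combined objective: reconstruction + pseudo-segmentation.

    seg_weight is an assumed hyperparameter, not taken from the paper.
    """
    return mse(recon, image) + seg_weight * bce(seg_prob, pseudo_label)

# Perfect reconstruction, maximally uncertain segmentation head.
loss = dual_task_loss([0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [1, 0])
```

In this toy call the reconstruction term is zero, so the loss reduces to the cross-entropy of a 0.5 prediction, which is ln 2.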
- Conference Article
89
- 10.1109/icmla.2019.00148
- Dec 1, 2019
Colorectal cancer (CRC) is one of the most commonly diagnosed cancers and a leading cause of cancer deaths in the United States. Colorectal polyps that grow on the intima of the colon or rectum are an important precursor of CRC. Currently, colonoscopy is the most common method for detecting colorectal polyps and precancerous pathology. Therefore, accurate colorectal polyp segmentation during the colonoscopy procedure has great clinical significance for early detection and prevention of CRC. In this paper, we propose a novel end-to-end deep learning framework for colorectal polyp segmentation. The model consists of an encoder that extracts multi-scale semantic features and a decoder that expands the feature maps into a polyp segmentation map. We improve the feature representation ability of the encoder by introducing dilated convolution to learn high-level semantic features without resolution reduction. We further design a simplified decoder that combines multi-scale semantic features with fewer parameters than the traditional architecture. Furthermore, we apply three post-processing techniques to the output segmentation map to improve colorectal polyp detection performance. Our method achieves state-of-the-art results on CVC-ClinicDB and ETIS-Larib Polyp DB.
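To make the idea of dilated convolution concrete, here is a minimal one-dimensional sketch in plain Python. It is an illustration only, not the paper's encoder: the function names, the zero padding, and the odd kernel size are assumptions. It shows that spacing the kernel taps apart widens the receptive field while the output keeps the input's length.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Same-length 1D dilated cross-correlation with zero padding.

    Assumes an odd kernel length so the kernel has a well-defined centre.
    """
    k = len(kernel)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < n:  # out-of-range taps read zero padding
                acc += w * signal[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

y = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
rf = receptive_field(3, [1, 2, 4])
```

With three stacked 3-tap layers at dilations 1, 2, and 4, each output position sees 15 input samples, yet no layer reduces resolution.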
- Research Article
11
- 10.1016/j.engappai.2024.108962
- Jul 26, 2024
- Engineering Applications of Artificial Intelligence
Rethinking encoder-decoder architecture using vision transformer for colorectal polyp and surgical instruments segmentation
- Conference Article
95
- 10.1109/cisp-bmei.2017.8301980
- Oct 1, 2017
Colorectal cancer is the third most common cancer in the United States, and most cases are associated with colorectal polyps. In hospitals, colonoscopy is a common way to detect colorectal polyps. Colorectal polyp segmentation plays an important role in the diagnosis and prevention of digestive system diseases. Therefore, there is a pressing need for a computer-aided polyp segmentation system to help doctors in diagnosis. In this paper, we propose a new end-to-end fully convolutional neural network structure for segmenting colorectal polyps. This method directly outputs a prediction map of the same size as the input image. We use the CVC-ClinicDB database to evaluate our method. The proposed method achieves an accuracy of 96.98%, an F1 score of 83.01%, a sensitivity of 77.32%, and a specificity of 99.05%.
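The four reported metrics all derive from pixel-level confusion counts. The sketch below uses the standard definitions on flattened binary masks; the function names and the toy masks are illustrative and are not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, TN, FN over two flattened binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def pixel_metrics(pred, truth):
    """Accuracy, sensitivity (recall), specificity, and F1 for binary masks."""
    tp, fp, tn, fn = confusion_counts(pred, truth)

    def safe(num, den):
        return num / den if den else 0.0  # avoid division by zero

    return {
        "accuracy": safe(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe(tp, tp + fn),
        "specificity": safe(tn, tn + fp),
        "f1": safe(2 * tp, 2 * tp + fp + fn),
    }

# Toy example: 1 = polyp pixel, 0 = background pixel.
metrics = pixel_metrics([1, 1, 0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 1, 1, 0, 0])
```

Because background pixels usually dominate colonoscopy frames, accuracy and specificity can be very high while sensitivity stays noticeably lower, which is consistent with the pattern of values reported above.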
- Research Article
- 10.1117/1.jmi.12.6.064004
- Dec 4, 2025
- Journal of Medical Imaging (Bellingham, Wash.)
Accurate segmentation and precise delineation of colorectal polyp structures are crucial for early clinical diagnosis and treatment planning. However, existing polyp segmentation techniques face significant challenges due to the high variability in polyp size and morphology, as well as the frequent indistinctness of polyp-tissue structures. To address these challenges, we propose a multiscale attention network with structure guidance (MAN-SG). The core of MAN-SG is a structure extraction module (SEM) designed to capture rich structural information from fine-grained early-stage encoder features. In addition, we introduce a cross-scale structure guided attention (CSGA) module that effectively fuses multiscale features under the guidance of the structural information provided by the SEM, thereby enabling more accurate delineation of polyp structures. MAN-SG is implemented and evaluated using two high-performance backbone networks: Res2Net-50 and PVTv2-B2. Extensive experiments were conducted on five benchmark datasets for polyp segmentation. The results demonstrate that MAN-SG consistently outperforms existing state-of-the-art methods across these datasets. The proposed MAN-SG framework, which leverages structural guidance via SEM and CSGA modules, proves to be both highly effective and robust for the challenging task of colorectal polyp segmentation.
- Research Article
47
- 10.1016/j.compbiomed.2023.107028
- May 10, 2023
- Computers in Biology and Medicine
PPNet: Pyramid pooling based network for polyp segmentation
- Research Article
9
- 10.1002/ima.23062
- Apr 30, 2024
- International Journal of Imaging Systems and Technology
Colorectal cancer is a common gastrointestinal malignancy, and early screening and segmentation of colorectal polyps are of great clinical significance. Colonoscopy is the most effective method to detect polyps, but some polyps may be missed during the examination, which makes computer‐aided diagnosis particularly important for colorectal polyp segmentation. To improve the detection rate of intestinal polyps under colonoscopy, a polyp segmentation network (MobileRaNet) based on a lightweight model and a reverse attention (RA) mechanism is proposed to accurately segment polyps in colonoscopy images. First, the coordinated attention module is used to improve MobileNetV3, and the improved network serves as the backbone (CaNet). Second, part of the high‐level feature output of the backbone is passed into the parallel axial receptive field module (PA_RFB) to extract a global dependency representation without losing detail. Third, a global map is generated from this combined feature as the initial guidance area for the subsequent components. Finally, the RA module is used to mine the target region and boundary cues to improve segmentation accuracy. To verify the effectiveness and lightweight design of the algorithm, five challenging datasets, including CVC‐ColonDB, CVC‐300, and Kvasir, are used. MobileRaNet is compared with seven typical models, such as PraNet and TransUnet, on six indexes, including MeanDice, MeanIoU, and MAE, as well as on accuracy, FLOPs, parameters, and FPS. The experimental results show that MobileRaNet improves performance on the five datasets to varying degrees; in particular, MeanDice and MeanIoU on the Kvasir dataset reach 91.2% and 85.6%, which are 1.4% and 1.6% higher than PraNet, respectively. Compared with PraNet, FLOPs and parameters decrease by 83.3% and 76.7%, respectively.
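The reverse attention step can be sketched as follows. This assumes the formulation commonly used in PraNet-style networks, in which features are weighted by one minus the sigmoid of a coarse prediction; the abstract does not spell out MobileRaNet's exact RA module, so treat this as an illustration of the general idea rather than the authors' implementation.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def reverse_attention(features, coarse_logits):
    """Weight each feature value by 1 - sigmoid(coarse logit).

    Regions the coarse map already marks confidently as polyp are suppressed,
    so later layers focus on uncertain regions and boundaries.
    Both arguments are 2D lists of the same shape (assumed for illustration).
    """
    return [
        [f * (1.0 - sigmoid(z)) for f, z in zip(f_row, z_row)]
        for f_row, z_row in zip(features, coarse_logits)
    ]

# An uncertain location (logit 0) keeps half of its feature response;
# a confidently predicted location (logit 10) is almost fully suppressed.
out = reverse_attention([[2.0, 2.0]], [[0.0, 10.0]])
```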
- Research Article
13
- 10.3390/jimaging8060169
- Jun 14, 2022
- Journal of Imaging
Colon polyps, small clumps of cells on the lining of the colon, can lead to colorectal cancer (CRC), one of the leading types of cancer globally. Hence, automatic early detection of these polyps is crucial in the prevention of CRC. The deep learning models proposed for the detection and segmentation of colorectal polyps are resource-intensive. This paper proposes a lightweight deep learning model for colorectal polyp segmentation that achieves state-of-the-art accuracy while significantly reducing model size and complexity. The proposed deep learning autoencoder model employs a set of state-of-the-art architectural blocks and optimization objective functions to achieve the desired efficiency. The model is trained and tested on five publicly available colorectal polyp segmentation datasets (CVC-ClinicDB, CVC-ColonDB, EndoScene, Kvasir, and ETIS). We also performed ablation testing to examine various aspects of the autoencoder architecture, and we evaluated the model using most of the common image-segmentation metrics. The backbone model achieved a DICE score of 0.935 on the Kvasir dataset and 0.945 on the CVC-ClinicDB dataset, improving the accuracy by 4.12% and 5.12%, respectively, over the current state-of-the-art network, while using 88 times fewer parameters, 40 times less storage space, and being computationally 17 times more efficient. Our ablation study showed that adding ConvSkip to the autoencoder slightly improves the model’s performance, but the improvement was not significant (p-value = 0.815).
- Research Article
- 10.1007/s10462-025-11369-6
- Oct 27, 2025
- Artificial Intelligence Review
Accurate segmentation of medical diseases, particularly in the detection and delineation of colorectal polyps, remains a critical challenge in medical diagnostics, as traditional image processing techniques often fail to capture the complexity and variability of polyp data, leading to inconsistent results and potentially impacting clinical outcomes. This review aims to study and analyze the latest 110 deep learning (DL) techniques from 2018 to 2024 with more than 100 open-source codes for polyp segmentation in a single review paper, with a focus on semantic networks, attention mechanisms, multiscale cascades, and transformer architectures, exploring their potential to improve the accuracy and robustness of colorectal polyp segmentation. Through a comprehensive review of existing literature, we classify and assess key methodologies, including single network models, multiple network models, hybrid models, and transformer-based models, particularly in their ability to handle variability in polyps’ patterns and enhance model interpretability. Our findings indicate that transformer-based architectures, especially those employing self-attention mechanisms, significantly enhance segmentation accuracy compared to traditional convolutional approaches, while semantic networks and multiscale cascades also show improved performance in addressing polyp variability across different scales. However, these advanced models bring challenges in terms of computational complexity and resource demands. The integration of these DL techniques offers transformative potential for improving diagnostic accuracy in colorectal polyp segmentation, and future research should focus on optimizing these models for clinical application by addressing computational demands and enhancing generalizability across diverse datasets, providing a roadmap for future development in colonoscopy imaging.
- Research Article
10
- 10.1016/j.bspc.2024.106210
- Mar 15, 2024
- Biomedical Signal Processing and Control
EfficientPolypSeg: Efficient Polyp Segmentation in colonoscopy images using EfficientNet-B5 with dilated blocks and attention mechanisms
- Research Article
66
- 10.1016/j.neunet.2023.11.050
- Nov 24, 2023
- Neural Networks
Boundary uncertainty aware network for automated polyp segmentation
- Conference Article
5
- 10.1109/nics54270.2021.9701580
- Dec 21, 2021
Automatic polyp detection and segmentation are desirable for colon screening because the polyp miss rate in clinical practice is relatively high. Deep learning-based approaches to polyp segmentation have gained much attention in recent years because their automatic feature extraction can segment polyp regions with unprecedented precision. However, training these networks requires a large amount of manually annotated data, which is limited by the available time of endoscopists. We propose a self-supervised visual learning method for polyp segmentation to address this challenge. We adapt self-supervised visual feature learning with image reconstruction as the pretext task and polyp segmentation as the downstream task. UNet is used as the backbone architecture for both tasks. An unlabeled colonoscopy image dataset is used to train the pretext network. For polyp segmentation, we apply transfer learning to the pretext network and train the segmentation network on a public benchmark dataset. Our experiments demonstrate that the proposed self-supervised learning method achieves better segmentation accuracy than a UNet trained from scratch. On the CVC-ColonDB polyp segmentation dataset, with only 300 annotated images, the proposed method improves the IoU from 76.87% to 81.99% and the Dice score from 86.61% to 89.33% compared to the baseline UNet.
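For reference, the IoU and Dice metrics quoted above can be computed from binary masks as in the following minimal sketch. These are the standard definitions; the function names and the toy masks are illustrative.

```python
def iou(pred, truth):
    """Intersection over union of two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0  # two empty masks agree fully

def dice(pred, truth):
    """Dice coefficient: 2 * |A intersect B| / (|A| + |B|)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * inter / size if size else 1.0

pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
i = iou(pred_mask, true_mask)
d = dice(pred_mask, true_mask)
```

For a single pair of masks the two metrics are tied by Dice = 2 IoU / (1 + IoU). Scores averaged over a dataset need not satisfy this identity exactly, so the reported IoU and Dice values cannot be converted into one another directly.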
- Research Article
55
- 10.1109/jsen.2020.3015831
- May 15, 2021
- IEEE Sensors Journal
Untreated colorectal polyps can develop into colorectal cancer, which is a leading cause of cancer-related deaths. Colonoscopy is a commonly used method for colorectal polyp screening, but because it depends on the experience and subjective judgment of clinicians, one out of four polyps is not correctly recognized. In this article, we propose an automatic colorectal polyp segmentation system based on a deep convolutional neural network, aiming to improve the accuracy of colorectal polyp screening. The proposed ABC-Net comprises a shared encoder and two novel mutually constrained decoders for simultaneous polyp area and boundary segmentation. To fully exploit multi-scale image information, selective feature modules are embedded into the network and used to dynamically learn and fuse multi-scale feature representations. Furthermore, a new boundary-sensitive loss is proposed to model the interdependencies between the area and boundary branches: the information in the two branches is reciprocally propagated and constrained, yielding a significant improvement in segmentation accuracy. Extensive experiments are conducted on three public colorectal polyp datasets, and the results (e.g., F1 scores of 0.866, 0.915, and 0.874 on the EndoScene, Kvasir-SEG, and ETIS-Larib datasets, respectively) demonstrate the advantages of the proposed method.
- Research Article
13
- 10.1016/j.aej.2024.06.095
- Jul 6, 2024
- Alexandria Engineering Journal
Multi-scale and multi-path cascaded convolutional network for semantic segmentation of colorectal polyps
- Research Article
8
- 10.1016/j.compbiomed.2024.108186
- Feb 21, 2024
- Computers in Biology and Medicine
Unveiling camouflaged and partially occluded colorectal polyps: Introducing CPSNet for accurate colon polyp segmentation
- Research Article
5
- 10.3390/pr12051030
- May 19, 2024
- Processes
Efficient and precise colorectal polyp segmentation has significant implications for colorectal polyp screening. Although network variants derived from the Transformer achieve high accuracy in segmenting colorectal polyps with complex shapes, they have two main shortcomings: (1) fusing the multi-level semantic information at the output of the encoder may result in information loss, and (2) background noise is not adequately suppressed during segmentation. To address these challenges, we propose a cross-scale interaction fusion transformer for polyp segmentation (CIFFormer). First, a novel feature supplement module (FSM) supplements the missing details and explores potential features to enhance the feature representations. Additionally, to mitigate the interference of background noise, we design a cross-scale interactive fusion module (CIFM) that combines feature information from different layers to obtain richer multi-scale and more discriminative feature representations. Furthermore, a boundary-assisted guidance module (BGM) is proposed to help the segmentation network obtain boundary-enhanced details. Extensive experiments on five typical datasets demonstrate that CIFFormer has a clear advantage in segmenting polyps. Specifically, CIFFormer achieved an mDice of 0.925 and an mIoU of 0.875 on the Kvasir-SEG dataset, surpassing the segmentation accuracy of competing methods.