Contour-Weighted Loss For Class-Imbalanced Image Segmentation
Image segmentation is critically important in medical image analysis for automatic interpretation and processing. However, it is often challenging due to intra- and inter-class data imbalance, which results in over- or under-segmentation. To address this issue, we propose a compact yet effective contour-weighted loss function. Our new loss incorporates a contour-weighted cross-entropy loss and a separable Dice loss. The former extracts the contour of target regions via morphological erosion and generates a weight map for the cross-entropy criterion, whereas the latter divides the target regions into contour and non-contour components using the extracted contour map, computes the Dice loss for each separately, and combines them to update the network. We carried out abdominal organ segmentation and brain tumor segmentation on two public datasets to assess our approach. Experimental results demonstrate that our approach offers superior segmentation compared to several state-of-the-art methods, while also improving the robustness of those popular deep models through our new loss function. The code is available at https://github.com/huangzyong/Contour-weighted-Loss-Seg.
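The contour-weighting idea described above can be sketched in a few lines of pure Python. This is an illustrative toy, not the authors' released implementation: the 3x3 structuring element, the contour weight value, and the function names are all assumptions.

```python
import math

def erode(mask):
    """Binary erosion with an assumed 3x3 structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            out[i][j] = 1 if all(
                0 <= i + di < h and 0 <= j + dj < w and mask[i + di][j + dj]
                for di in (-1, 0, 1) for dj in (-1, 0, 1)
            ) else 0
    return out

def contour_map(mask):
    """Contour = mask minus its erosion, as in the abstract."""
    er = erode(mask)
    return [[mask[i][j] - er[i][j] for j in range(len(mask[0]))]
            for i in range(len(mask))]

def contour_weighted_ce(prob_fg, mask, contour, w_contour=2.0):
    """Binary cross-entropy with contour pixels up-weighted.
    w_contour=2.0 is a placeholder, not a value from the paper."""
    eps, total, norm = 1e-7, 0.0, 0.0
    for i in range(len(mask)):
        for j in range(len(mask[0])):
            w = w_contour if contour[i][j] else 1.0
            p = prob_fg[i][j] if mask[i][j] else 1.0 - prob_fg[i][j]
            total += -w * math.log(max(p, eps))
            norm += w
    return total / norm
```

The same contour map could then be used to split the prediction into contour and non-contour regions before computing two separate Dice terms, per the "separable dice loss" described in the abstract.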
- Research Article
49
- 10.1609/aaai.v36i2.20139
- Jun 28, 2022
- Proceedings of the AAAI Conference on Artificial Intelligence
This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the boundary alignment problem as a differentiable direction vector prediction problem to guide the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged in to the training of segmentation networks to improve the boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets.
- Research Article
9
- 10.1016/j.patcog.2022.109208
- Nov 25, 2022
- Pattern Recognition
We propose Region-wise (RW) loss for biomedical image segmentation. Region-wise loss is versatile, can simultaneously account for class imbalance and pixel importance, and it can be easily implemented as the pixel-wise multiplication between the softmax output and a RW map. We show that, under the proposed RW loss framework, certain loss functions, such as Active Contour and Boundary loss, can be reformulated similarly with appropriate RW maps, thus revealing their underlying similarities and a new perspective to understand these loss functions. We investigate the observed optimization instability caused by certain RW maps, such as Boundary loss distance maps, and we introduce a mathematically-grounded principle to avoid such instability. This principle provides excellent adaptability to any dataset and practically ensures convergence without extra regularization terms or optimization tricks. Following this principle, we propose a simple version of boundary distance maps called rectified Region-wise (RRW) maps that, as we demonstrate in our experiments, achieve state-of-the-art performance with similar or better Dice coefficients and Hausdorff distances than Dice, Focal, weighted Cross entropy, and Boundary losses in three distinct segmentation tasks. We quantify the optimization instability provided by Boundary loss distance maps, and we empirically show that our RRW maps are stable to optimize. The code to run all our experiments is publicly available at: https://github.com/jmlipman/RegionWiseLoss.
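The core mechanism the abstract describes, a pixel-wise multiplication between the softmax output and a region-wise map, is simple enough to sketch. This is a minimal pure-Python illustration under the paper's stated formulation; the sign convention in the toy RW map (negative on the correct class so that minimizing the loss raises its probability) is an assumption.

```python
def rw_loss(softmax_probs, rw_map):
    """Region-wise loss: mean over all entries of the element-wise
    product of the softmax output and the RW map.
    Both inputs: nested lists of shape [classes][H][W]."""
    total, n = 0.0, 0
    for pc, mc in zip(softmax_probs, rw_map):
        for prow, mrow in zip(pc, mc):
            for p, m in zip(prow, mrow):
                total += p * m
                n += 1
    return total / n
```

Different choices of RW map recover different losses in this framework (e.g. distance-based maps for Boundary loss); the paper's rectified RW maps are one stabilized choice.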
- Research Article
83
- 10.1109/tmm.2020.3007331
- Jul 8, 2020
- IEEE Transactions on Multimedia
Semantic context plays a significant role in image segmentation. However, few prior works have explored semantic contexts for 3D point cloud segmentation. In this paper, we propose a simple yet effective Point Context Encoding (PointCE) module to capture semantic contexts of a point cloud and adaptively highlight intermediate feature maps. We also introduce a Semantic Context Encoding loss (SCE-loss) to supervise the network to learn rich semantic context features. To avoid hyperparameter tuning and achieve better convergence performance, we further propose a geometric mean loss to integrate both SCE-loss and segmentation loss. Our PointCE module is general and lightweight, and can be integrated into any point cloud segmentation architecture to improve its segmentation performance with only marginal extra overheads. Experimental results on the ScanNet, S3DIS and Semantic3D datasets show that consistent and significant improvement can be achieved for several different networks by integrating our PointCE module.
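The geometric-mean combination of the SCE-loss and the segmentation loss mentioned above avoids a hand-tuned mixing weight. A minimal sketch of that combination rule, with an assumed epsilon floor for numerical safety:

```python
import math

def geometric_mean_loss(sce_loss, seg_loss, eps=1e-8):
    """Combine two loss terms without a tunable trade-off weight:
    sqrt(L_sce * L_seg). The eps floor is an assumed safeguard."""
    return math.sqrt(max(sce_loss, eps) * max(seg_loss, eps))
```

Because the geometric mean is small whenever either term is small, neither loss can dominate the way it can in a poorly weighted sum.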
- Book Chapter
- 10.1007/978-3-031-15934-3_2
- Jan 1, 2022
Accurate segmentation of ultrasound inspection images is a challenging task in the medical image segmentation domain, and it is difficult to obtain satisfactory segmentation with U-Net networks in deep learning. The difficulties are attributed to low contrast between detected targets and surrounding tissues, large variations in target edges and shapes, and so forth. Based on batch-free normalization (BFN) and a residual attention block, an Attention Res BFN U-Net (ARB U-Net) network with a deep encoder and a shallow decoder is proposed, improving both the depth and the performance of the network. Utilizing Dice loss and BCE loss as the segmentation loss and classification loss respectively, a Dice-BCE loss function is constructed on the basis of a multi-task weighting strategy. 450 ultrasound images were used as the training set and another 50 images as the test set. The average segmentation accuracy on the test set reached 97.1%, about 3% better than that of the traditional U-Net and its common variants. The experimental results show that the proposed network can significantly improve the accuracy and precision of ultrasound image segmentation of the suprapatellar bursa.
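A combined Dice-BCE loss of the kind described above can be sketched as follows. This is a generic illustration, not the paper's implementation: the fixed 0.5/0.5 weights stand in for the paper's multi-task weighting strategy, which is not specified in the abstract.

```python
import math

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss over flattened probabilities and binary targets."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy, clamped away from log(0)."""
    return -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
                for p, t in zip(pred, target)) / len(pred)

def dice_bce_loss(pred, target, w_dice=0.5, w_bce=0.5):
    """Weighted sum of Dice and BCE; the fixed weights here are
    placeholders for the paper's multi-task weighting strategy."""
    return w_dice * dice_loss(pred, target) + w_bce * bce_loss(pred, target)
```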
- Conference Article
4
- 10.1109/sibgrapi54419.2021.00055
- Oct 1, 2021
Image segmentation is an ill-posed problem by definition, as it is not always possible to automatically select which object appearing in an image is the object of interest. To deal with this issue, prior knowledge in the form of human-given markers can be included in the segmentation pipeline. Even though user interaction can drastically improve segmentation results, it is an expensive resource, and finding ways to reduce human effort on an interactive segmentation loop is of great interest. In this work, we propose a new segmentation layer to be used with deep neural networks, which allows us to create and train in an end-to-end fashion a marker creation network. To train the network, we propose a loss function composed of: a segmentation loss using the proposed differentiable segmentation layer; and a set of regularization functions that enforce the desired characteristics on the produced markers. We showed that by using the proposed layer and loss function, we can train the network to automatically generate markers that recover a good segmentation and have desirable shape characteristics. This behavior is observed on the training dataset, as well as on four unseen datasets.
- Research Article
18
- 10.1109/embc.2019.8857527
- Jul 1, 2019
- Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
Ischaemic stroke is a medical condition caused by occlusion of blood supply to the brain tissue, thus forming a lesion. A lesion is zoned into a core associated with irreversible necrosis, typically located at the center of the lesion, while the reversible hypoxic changes in its outer regions are termed the penumbra. Early estimation of core and penumbra in ischaemic stroke is crucial for timely intervention with thrombolytic therapy to reverse the damage and restore normalcy. Multisequence magnetic resonance imaging (MRI) is commonly employed for clinical diagnosis. However, no single sequence has been found sufficient to differentiate between core and penumbra; a combination of sequences is required to determine the extent of the damage. The challenge, however, is that as the number of sequences increases, it becomes cognitively taxing for the clinician to discover symptomatic biomarkers in these images. In this paper, we present a data-driven, fully automated method for estimation of core and penumbra in ischaemic lesions using diffusion-weighted imaging (DWI) and perfusion-weighted imaging (PWI) sequence maps of MRI. The method employs recent developments in convolutional neural networks (CNN) for semantic segmentation in medical images. In the absence of a large amount of labeled data, the CNN is trained using an adversarial approach employing cross-entropy as a segmentation loss, along with losses aggregated from three discriminators, two of which employ a relativistic visual Turing test. The method is experimentally validated on the ISLES-2015 dataset through three-fold cross-validation, obtaining average Dice scores of 0.82 and 0.73 for segmentation of penumbra and core respectively.
- Conference Article
- 10.1117/12.2601650
- Apr 4, 2022
Precise tissue segmentation of histopathology images is often a crucial step in computational pathology pipelines. However, visual scoring by pathologists is sensitive and depends on their experience and perception. Therefore, there is a need for novel automatic systems to improve the accuracy and reproducibility of pathologists' interpretations. Here, a group affinity weakly supervised segmentation method (GAWS) is proposed to conquer this task, with the following pipeline. First, we create a cluster image by extracting the visual feature of each pixel using CNN and clustering it into different classes. Then, we create a target image by refining this cluster image with the constraints on prior tissue, color, and spatial distribution of pixels. Finally, a backpropagation process with a segmentation loss is considered to evaluate the error signals between cluster and target images and update the network parameters. We validate our method with extracellular mucin-to-tumor area quantification using a colorectal cancer clinical dataset with 163 Hematoxylin Eosin (H&E) whole slide images from 97 patients. Inter-observer agreement between pathologists and the proposed algorithm is excellent (ICC=0.917) and more accurate compared with two state-of-the-art unsupervised segmentation methods. Our results show that the GAWS results in a high average performance and excellent reliability when applied to histopathology images and possibly is a promising method for inclusion into clinical practice. This approach takes advantage of weakly supervised learning without any pre-trained network to have a tumor quantification tool that could improve the pathologist's workflow.
- Book Chapter
4
- 10.1007/978-3-031-16434-7_25
- Jan 1, 2022
Semantic segmentation of whole slide images (WSIs) helps pathologists identify lesions and cancerous nests. However, training fully supervised segmentation networks usually requires plenty of pixel-level annotations, which consume much time and human effort. Coming from tissues of different patients with large numbers of pixels, WSIs exhibit various patterns, resulting in intra-class heterogeneity and inter-class homogeneity. Meanwhile, most existing methods for WSIs focus on extracting a certain type of features, neglecting the relations between different features and their joint effect on segmentation. Therefore, we propose a novel weakly supervised network based on tensor graphs (WSNTG) for WSI segmentation. Using only sparse point annotations, it efficiently segments WSIs by superpixel-wise classification and credible node reweighting. To deal with the variability of WSIs, the proposed network represents multiple hand-crafted features and hierarchical features yielded by a pretrained Convolutional Neural Network (CNN). Particularly, it learns over the semi-labeled tensor graphs constructed on the hierarchical features to exploit nonlinear data structures and associations. It gains robustness via the tensor-graph Laplacian of the hand-crafted features superimposed on the segmentation loss. We evaluated WSNTG on two WSI datasets, DigestPath2019 and SICAPV2. Results show that it outperforms many fully supervised and weakly supervised methods with minimal point annotations in WSI segmentation. The codes are published at https://github.com/zqh369/WSNTG. Keywords: Weakly-supervised segmentation; Pathology image segmentation; Graph convolutional networks; Node reweighting
- Research Article
12
- 10.1117/12.2582127
- Feb 15, 2021
- Proceedings of SPIE--the International Society for Optical Engineering
Accurately segmenting organs in abdominal computed tomography (CT) scans is crucial for clinical applications such as pre-operative planning and dose estimation. With the recent advent of deep learning algorithms, many robust frameworks have been proposed for organ segmentation in abdominal CT images. However, many of these frameworks require large amounts of training data in order to achieve high segmentation accuracy. Pediatric abdominal CT images containing reproductive organs are particularly hard to obtain since these organs are extremely sensitive to ionizing radiation. Hence, it is extremely challenging to train automatic segmentation algorithms on organs such as the uterus and the prostate. To address these issues, we propose a novel segmentation network with a built-in auxiliary classifier generative adversarial network (ACGAN) that conditionally generates additional features during training. The proposed CFG-SegNet (conditional feature generation segmentation network) is trained on a single loss function which combines adversarial loss, reconstruction loss, auxiliary classifier loss and segmentation loss. 2.5D segmentation experiments are performed on a custom dataset comprising 24 female CT volumes with the uterus and 40 male CT volumes with the prostate. CFG-SegNet achieves an average segmentation accuracy of 0.929 DSC (Dice Similarity Coefficient) on the prostate and 0.724 DSC on the uterus with 4-fold cross-validation. The results show that our network is high-performing and has the potential to precisely segment difficult organs with few available training images.
- Research Article
12
- 10.1016/j.cmpb.2024.108178
- Apr 21, 2024
- Computer Methods and Programs in Biomedicine
VENet: Variational energy network for gland segmentation of pathological images and early gastric cancer diagnosis of whole slide images
- Conference Article
5
- 10.1109/spices52834.2022.9774193
- Mar 10, 2022
Semantic segmentation using deep learning techniques is now state-of-the-art in medical image segmentation, especially for brain magnetic resonance images (MRI). SegNet, a fully convolutional neural network architecture, is widely used for image segmentation, although it is less frequently used than the competing U-Net approach for brain MRI segmentation. A few researchers have proposed fusions of the SegNet and U-Net architectures to combine their desirable properties for performance gains. In this paper, a different direction of research is undertaken: a simpler yet more accurate shallow SegNet architecture is proposed that yields promising segmentation performance on brain MRI. The proposed architecture uses a bilinear interpolation upsampling mechanism instead of max unpooling. Further, a modified cross-entropy loss is employed that is weighted differently for different classes; the class imbalance problem is effectively overcome using this weighted cross-entropy loss. Performance comparison of the proposed architecture with existing works indicates that the average Dice coefficient is enhanced to 0.83, an improvement of 0.11 over the baseline SegNet. It is demonstrated that the proposed shallow SegNet is a simpler yet more accurate model compared to both the existing SegNet and U-Net, and could serve as a baseline for fine-grained image segmentation tasks.
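A class-weighted cross-entropy of the kind described above can be sketched in pure Python. The inverse-frequency weighting scheme shown here is one common choice and an assumption; the abstract does not state which weighting the authors use.

```python
import math

def inverse_frequency_weights(labels, n_classes):
    """Assumed weighting scheme: weight each class by its inverse
    relative frequency so rare classes count more."""
    counts = [max(labels.count(c), 1) for c in range(n_classes)]
    total = len(labels)
    return [total / (n_classes * cnt) for cnt in counts]

def class_weighted_ce(probs, labels, class_weights, eps=1e-7):
    """Cross-entropy where each pixel's term is scaled by the weight
    of its true class. probs: per-pixel probability vectors;
    labels: true class indices."""
    total, norm = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(max(p[y], eps))
        norm += w
    return total / norm
```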
- Research Article
189
- 10.1016/j.media.2023.102792
- Jul 1, 2023
- Medical Image Analysis
Supervised deep learning-based methods yield accurate results for medical image segmentation. However, they require large labeled datasets for this, and obtaining them is a laborious task that requires clinical expertise. Semi/self-supervised learning-based approaches address this limitation by exploiting unlabeled data along with limited annotated data. Recent self-supervised learning methods use contrastive loss to learn good global-level representations from unlabeled images and achieve high performance in classification tasks on popular natural image datasets like ImageNet. In pixel-level prediction tasks such as segmentation, it is crucial to also learn good local-level representations along with global representations to achieve better accuracy. However, the impact of the existing local contrastive loss-based methods remains limited for learning good local representations because similar and dissimilar local regions are defined based on random augmentations and spatial proximity, not on the semantic label of local regions, due to the lack of large-scale expert annotations in the semi/self-supervised setting. In this paper, we propose a local contrastive loss to learn good pixel-level features useful for segmentation by exploiting semantic label information obtained from pseudo-labels of unlabeled images alongside limited annotated images with ground truth (GT) labels. In particular, we define the proposed contrastive loss to encourage similar representations for pixels that have the same pseudo-label/GT label while being dissimilar to the representations of pixels with a different pseudo-label/GT label. We perform pseudo-label based self-training and train the network by jointly optimizing the proposed contrastive loss on both labeled and unlabeled sets and the segmentation loss on only the limited labeled set.
We evaluated the proposed approach on three public medical datasets of cardiac and prostate anatomies, and obtain high segmentation performance with a limited labeled set of one or two 3D volumes. Extensive comparisons with the state-of-the-art semi-supervised and data augmentation methods and concurrent contrastive learning methods demonstrate the substantial improvement achieved by the proposed method. The code is made publicly available at https://github.com/krishnabits001/pseudo_label_contrastive_training.
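The label-driven contrastive objective described above can be illustrated with a tiny pure-Python sketch over a handful of pixel embeddings. This is a generic supervised-contrastive formulation under assumed conventions (L2-normalized embeddings, temperature 0.1), not the authors' exact loss.

```python
import math

def pixel_contrastive_loss(embeddings, labels, tau=0.1):
    """Pixels sharing a pseudo-label/GT label are positives; all others
    are negatives. Embeddings are assumed L2-normalized."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    n, total, count = len(embeddings), 0.0, 0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        # Similarities to every other pixel, in index order (skipping i).
        sims = [math.exp(dot(embeddings[i], embeddings[j]) / tau)
                for j in range(n) if j != i]
        denom = sum(sims)
        for j in pos:
            k = j if j < i else j - 1  # position of j in the sims list
            total += -math.log(sims[k] / denom)
            count += 1
    return total / count
```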
- Research Article
6
- 10.1002/cav.2023
- May 24, 2021
- Computer Animation and Virtual Worlds
While extensive research efforts have been made in semantic image segmentation, the state-of-the-art methods still suffer from blurry boundaries and mismatched objects due to insufficient multiscale adaptability. In this paper, we propose a two-branch convolutional neural network (CNN) approach to capture the multiscale context and the boundary information with the two branches, respectively. To capture the multiscale context, we propose to embed a self-attention mechanism into the atrous spatial pyramid pooling network. To capture the boundary information, we propose to fuse the low-level features in boundary feature extraction, refining the extracted boundaries via a feature fusion layer (FFL). With FFL, our method can improve the segmentation result with clearer boundaries. A new loss function is proposed which contains a segmentation loss and a boundary loss. Experiments show that our method predicts the boundaries of objects more clearly and performs better for small-scale objects.
- Research Article
1
- 10.3390/sym17111807
- Oct 27, 2025
- Symmetry
Cardiac medical image segmentation can advance healthcare and embedded vision systems. In this paper, a symmetric semantic segmentation architecture for cardiac magnetic resonance (MR) images based on a symmetric multiscale detail-guided attention network is presented. Detailed information and multiscale attention maps can be exploited more efficiently in this model. A symmetric encoder and decoder are used to generate high-dimensional semantic feature maps and segmentation masks, respectively. First, a series of densely connected residual blocks is introduced for extracting high-dimensional semantic features. Second, an asymmetric detail-guided module is proposed. In this module, a feature pyramid extracts detailed information and generates detailed feature maps as part of the detail guidance of the model during the training phase; these maps are used to extract deep multiscale features and to compute a detail loss against specific encoder semantic features. Third, a series of multiscale upsampling attention blocks symmetrical to the encoder is introduced in the decoder of the model. For each upsampling attention block, feature fusion is first performed on the previous-level low-resolution features and the symmetric skip connections of the same layer, and then spatial and channel attention are used to enhance the features. Image gradients of the input images are also introduced at the end of the decoder. Finally, the predicted segmentation masks are obtained by calculating a detail loss and a segmentation loss. Our method demonstrates outstanding performance on a public cardiac MR image dataset, achieving strong results for endocardial and epicardial segmentation of the left ventricle (LV).
- Research Article
3
- 10.1016/j.compbiomed.2022.106326
- Nov 16, 2022
- Computers in Biology and Medicine
Learning to segment subcortical structures from noisy annotations with a novel uncertainty-reliability aware learning framework