Style Consistent Image Generation for Nuclei Instance Segmentation

Abstract

In medical image analysis, one limitation of the application of machine learning is the insufficient amount of data with detailed annotation, due primarily to high cost. Another impediment is the domain gap observed between images from different organs and different collections. The differences are even more challenging for the nuclei instance segmentation, where images have significant nuclei stain distribution variations and complex pleomorphisms (sizes and shapes). In this work, we generate style consistent histopathology images for nuclei instance segmentation. We set up a novel instance segmentation framework that integrates a generator and discriminator into the segmentation pipeline with adversarial training to generalize nuclei instances and texture patterns. A segmentation net detects and segments both real nuclei and synthetic nuclei and provides feedback so that the generator can synthesize images that can boost the segmentation performance. Experimental results on three public nuclei datasets indicate that our proposed method outperforms previous nuclei segmentation methods.

Similar Papers
  • Research Article
  • Cited by 30
  • 10.1109/access.2020.3003917
Syncretic-NMS: A Merging Non-Maximum Suppression Algorithm for Instance Segmentation
  • Jan 1, 2020
  • IEEE Access
  • Jun Chu + 4 more

Instance segmentation is typically based on an object detection framework. Semantic segmentation is conducted on the bounding boxes that are returned by detectors. NMS (non-maximum suppression) is a common post-processing operation in instance segmentation and object detection tasks. It is typically used after bounding box regression to eliminate redundant bounding boxes. The evaluation criteria for object detection require that the bounding box be as close as possible to the ground truth, but they do not emphasize the integrity of the included object. However, sometimes the bounding boxes cannot contain the complete objects, and the parts beyond the bounding boxes cannot be correctly predicted in the subsequent semantic segmentation. To solve this problem, we propose the Syncretic-NMS algorithm. The algorithm takes traditional NMS as the first step and processes the bounding boxes obtained by traditional NMS, judges the neighboring bounding boxes of each bounding box, and combines the neighboring boxes that are strongly correlated with the corresponding bounding boxes. The coordinates of the merged box are the four coordinate extremes of the bounding box and the highly relevant neighboring box. The neighboring box with strong correlation is merged with the corresponding bounding box. Based on an analysis of the influences of corresponding factors, the criteria for correlation judgment are specified. Experimental results on the MS COCO dataset demonstrate that Syncretic-NMS can steadily increase the accuracy of instance segmentation, while experimental results on the Cityscapes dataset prove that the algorithm can adapt to application scenario changes. The computational complexity of Syncretic-NMS is the same as that of traditional NMS. Syncretic-NMS is easy to implement, requires no additional training, and can be easily integrated into the available instance segmentation framework.
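The merging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the correlation criteria, so the sketch substitutes a plain IoU threshold (`merge_thr`, a hypothetical parameter), treats the boxes suppressed by a kept box as its "neighboring boxes", and folds the merge into the NMS pass instead of running it as a separate second step. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_with_merge(boxes, scores, nms_thr=0.5, merge_thr=0.7):
    """Greedy NMS that also merges strongly overlapping neighbours.

    Assumption: "strongly correlated" is approximated by IoU >= merge_thr.
    The merged box takes the four coordinate extremes of the kept box and
    the neighbours merged into it, as the abstract describes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = set()
    results = []
    for pos, i in enumerate(order):
        if i in suppressed:
            continue
        merged = list(boxes[i])
        for j in order[pos + 1:]:
            if j in suppressed:
                continue
            overlap = iou(boxes[i], boxes[j])
            if overlap > nms_thr:
                suppressed.add(j)  # traditional NMS suppression
                if overlap >= merge_thr:
                    # Extend the kept box to the coordinate extremes.
                    merged = [min(merged[0], boxes[j][0]),
                              min(merged[1], boxes[j][1]),
                              max(merged[2], boxes[j][2]),
                              max(merged[3], boxes[j][3])]
        results.append((tuple(merged), scores[i]))
    return results


boxes = [(10, 10, 50, 50), (12, 8, 54, 50), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms_with_merge(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so it is suppressed and its extent is absorbed into the kept box, while the third box is left untouched.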

  • Dissertation
  • 10.17760/d20439211
ARID
  • Aug 24, 2022
  • Rajwinder Singh

Instance segmentation algorithms are used everywhere, be it self-driving cars, scene mapping by autonomous robots, or analyzing medical scans. Instance segmentation can be thought of as a further refinement of semantic segmentation. Object detection algorithms try to detect objects in the scene by enclosing them in bounding boxes, semantic segmentation tries to label these objects, whereas instance segmentation tries to label each unique instance of these objects. The task is quite complex and becomes even more challenging when the scope is microscopic data. Objects in microscopic data do not usually follow a fixed shape or orientation; therefore, it becomes very difficult to identify unique instances of these objects using axis-aligned bounding boxes. The alternative approach that researchers take is to do pixel-wise prediction and then agglomerate the predictions to obtain the final object instances. In this thesis we present a novel loss function, which we use to train a U-Net that predicts n-dimensional embedding maps, or ARID (Affinity Representing Instance Descriptors). These embedding vectors contain dense information that can then be used to generate segmentation maps using post-processing approaches. Previous methods have attempted to learn affinities but are prone to errors, resulting in erroneous segmentation. We show that our segmentation pipeline using the ARID embedding map surpasses the performance of affinity-based networks and solves the problem of merge errors. Our segmentation pipeline has two phases. The first is predicting the ARID embedding, for which we trained a U-Net architecture using an ultrametric loss; multiple configurations were tested and compared. The second phase is post-processing, which is further divided into two steps: segmentation generation and refinement. We present a very basic technique that generates a Euclidean minimum spanning tree and prunes the edges whose distance exceeds a provided threshold to generate the segmentation. The other part of the post-processing pipeline is segmentation refinement, for which we propose several approaches. We evaluate performance using Average Precision (AP) at IoU thresholds ranging from 0.5 to 0.95 in increments of 0.05. The best average AP0.5 IoU score obtained from the affinity-based networks is 0.63; our segmentation pipeline generates segmentation maps with a best average AP0.5 IoU score of 0.826, surpassing the affinity-based network performance. We also show the failure modes of our proposed loss function and present the future scope of research in the field. Embedding-based approaches show promise for efficient instance segmentation, especially in complex scenes such as microscopic data. The generalized loss function that we present in this thesis is capable of doing this task and presents a better alternative to affinity-based methods for segmentation. (Author's abstract)
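The segmentation-generation step (build a Euclidean minimum spanning tree over the embeddings, prune long edges, read off the remaining connected components) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `mst_segments`, the O(n²) Prim's algorithm, and the toy 2-D embeddings are all assumptions made for the example, and a real pipeline would operate on per-pixel embedding vectors predicted by the network.

```python
import math


def mst_segments(embeddings, threshold):
    """Label points by pruning long edges of a Euclidean minimum spanning tree.

    embeddings: list of equal-length coordinate tuples (hypothetical input).
    threshold:  MST edges longer than this are cut.
    Returns one integer segment label per point, numbered by first appearance.
    """
    n = len(embeddings)
    if n == 0:
        return []

    # Prim's algorithm on the complete Euclidean graph.
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge into the tree for each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(embeddings[u], embeddings[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u

    # Union-find over the MST edges that survive pruning.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for a, b, d in edges:
        if d <= threshold:
            root[find(a)] = find(b)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
labels = mst_segments(points, threshold=1.0)
```

Cutting MST edges above a threshold yields the same components as single-linkage clustering at that threshold, which is why one spanning tree is enough to read off the segments.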

  • Research Article
  • Cited by 9
  • 10.1016/j.neucom.2022.09.112
UniInst: Unique representation for end-to-end instance segmentation
  • Sep 21, 2022
  • Neurocomputing
  • Yimin Ou + 7 more

  • Conference Article
  • Cited by 3
  • 10.1109/icpr56361.2022.9956531
DISF: Dynamic Instance Segmentation with Semantic Features
  • Aug 21, 2022
  • Hao Dong + 1 more

In this work, we propose a flexible and efficient two-stage instance segmentation framework, termed DISF (Dynamic Instance Segmentation with Semantic Features). Firstly, we divide the image into multiple regions of the same size and directly classify the pixels in different regions, which converts the instance-level segmentation task into a pixel-level classification task within each region. We make full use of the location and size information of objects to distinguish different instances of the same category and obtain relatively coarse instance segmentation results. Secondly, we decouple the prediction of instance masks into convolution kernel prediction and instance feature prediction. The instance masks are dynamically generated by convolution operations between the predicted convolution kernel and instance features. Segmentation results obtained in this way do not contain redundant information. Thirdly, a parallel branch of semantic segmentation is added to further refine the instance segmentation results. Semantic features provide global information about the image from a higher level. Semantic features and instance features are sent to the Features Fusion Module (FFM) to optimize the relatively coarse instance segmentation results generated in the previous stage. The experimental results reveal the promising potential of DISF in instance-level recognition.

  • Research Article
  • Cited by 24
  • 10.1016/j.compbiomed.2022.106180
A general deep learning framework for neuron instance segmentation based on Efficient UNet and morphological post-processing
  • Oct 4, 2022
  • Computers in biology and medicine
  • Huaqian Wu + 4 more

  • Conference Article
  • Cited by 5
  • 10.1109/isbi.2019.8759390
Volume R-CNN: Unified Framework for CT Object Detection and Instance Segmentation
  • Apr 1, 2019
  • Yun Chen + 6 more

As a fundamental task in computer vision, object detection methods for the 2D image such as Faster R-CNN and SSD can be efficiently trained end-to-end. However, current methods for volumetric data like computed tomography (CT) usually contain two steps to do region proposal and classification separately. In this work, we present a unified framework called Volume R-CNN for object detection in volumetric data. Volume R-CNN is an end-to-end method that could perform region proposal, classification and instance segmentation all in one model, which dramatically reduces computational overhead and parameter numbers. These tasks are joined using a key component named RoIAlign3D that extracts features of RoIs smoothly and works superiorly well for small objects in the 3D image. To the best of our knowledge, Volume R-CNN is the first common end-to-end framework for both object detection and instance segmentation in CT. Without bells and whistles, our single model achieves remarkable results in LUNA16. Ablation experiments are conducted to analyze the effectiveness of our method.

  • Research Article
  • Cited by 21
  • 10.1609/aaai.v36i3.20227
SOIT: Segmenting Objects with Instance-Aware Transformers
  • Jun 28, 2022
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Xiaodong Yu + 5 more

This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. Inspired by DETR, our method views instance segmentation as a direct set prediction problem and effectively removes the need for many hand-crafted components like RoI cropping, one-to-many label assignment, and non-maximum suppression (NMS). In SOIT, multiple queries are learned to directly reason a set of object embeddings of semantic category, bounding-box location, and pixel-wise mask in parallel under the global image context. The class and bounding-box can be easily embedded by a fixed-length vector. The pixel-wise mask, especially, is embedded by a group of parameters to construct a lightweight instance-aware transformer. Afterward, a full-resolution mask is produced by the instance-aware transformer without involving any RoI-based operation. Overall, SOIT introduces a simple single-stage instance segmentation framework that is both RoI- and NMS-free. Experimental results on the MS COCO dataset demonstrate that SOIT outperforms state-of-the-art instance segmentation approaches significantly. Moreover, the joint learning of multiple tasks in a unified query embedding can also substantially improve the detection performance. Code is available at https://github.com/yuxiaodongHRI/SOIT.

  • Conference Article
  • Cited by 132
  • 10.1109/cvpr42600.2020.01024
Mask Encoding for Single Shot Instance Segmentation
  • Jun 1, 2020
  • Rufeng Zhang + 4 more

To date, instance segmentation is dominated by two-stage methods, as pioneered by Mask R-CNN. In contrast, one-stage alternatives cannot compete with Mask R-CNN in mask AP, mainly due to the difficulty of compactly representing masks, making the design of one-stage methods very challenging. In this work, we propose a simple single-shot instance segmentation framework, termed mask encoding based instance segmentation (MEInst). Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the instance segmentation task to be incorporated into one-stage bounding-box detectors and results in a simple yet efficient instance segmentation framework. The proposed one-stage MEInst achieves 36.4% in mask AP with single-model (ResNeXt-101-FPN backbone) and single-scale testing on the MS-COCO benchmark. We show that a much simpler and more flexible one-stage instance segmentation method can also achieve competitive performance. This framework can be easily adapted for other instance-level recognition tasks. Code is available at: git.io/AdelaiDet

  • Research Article
  • Cited by 24
  • 10.1016/j.isprsjprs.2021.10.003
Densely connected graph convolutional network for joint semantic and instance segmentation of indoor point clouds
  • Oct 16, 2021
  • ISPRS Journal of Photogrammetry and Remote Sensing
  • Yu Wang + 5 more

  • Research Article
  • Cited by 24
  • 10.1117/1.jmi.8.1.014001
Instance segmentation for whole slide imaging: end-to-end or detect-then-segment.
  • Jan 7, 2021
  • Journal of medical imaging (Bellingham, Wash.)
  • Aadarsh Jha + 5 more

Purpose: Automatic instance segmentation of glomeruli within kidney whole slide imaging (WSI) is essential for clinical research in renal pathology. In computer vision, the end-to-end instance segmentation methods (e.g., Mask-RCNN) have shown their advantages relative to detect-then-segment approaches by performing complementary detection and segmentation tasks simultaneously. As a result, the end-to-end Mask-RCNN approach has been the de facto standard method in recent glomerular segmentation studies, where downsampling and patch-based techniques are used to properly evaluate the high-resolution images from WSI (e.g., on ). However, in high-resolution WSI, a single glomerulus itself can be more than in original resolution, which yields significant information loss when the corresponding feature maps are downsampled to the resolution via the end-to-end Mask-RCNN pipeline. Approach: We assess whether the end-to-end instance segmentation framework is optimal for high-resolution WSI objects by comparing Mask-RCNN with our proposed detect-then-segment framework. Beyond such a comparison, we also comprehensively evaluate the performance of our detect-then-segment pipeline through: (1) two of the most prevalent segmentation backbones (U-Net and DeepLab_v3); (2) six different image resolutions ( , , , , , and ); and (3) two different color spaces (RGB and LAB). Results: Our detect-then-segment pipeline, with the DeepLab_v3 segmentation framework operating on previously detected glomeruli of resolution, achieved a 0.953 Dice similarity coefficient (DSC), compared with a 0.902 DSC from the end-to-end Mask-RCNN pipeline. Further, we found that neither RGB nor LAB color spaces yield better performance when compared against each other in the context of a detect-then-segment framework. Conclusions: The detect-then-segment pipeline achieved better segmentation performance compared with the end-to-end method. Our study provides an extensive quantitative reference for other researchers to select the optimized and most accurate segmentation approach for glomeruli, or other biological objects of similar character, on high-resolution WSI.
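For readers unfamiliar with the metric reported above, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The short Python sketch below computes it for binary masks stored as nested lists. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example and are not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (nested lists).

    Assumption: two empty masks are treated as a perfect match (1.0).
    """
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```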

  • Research Article
  • Cited by 2
  • 10.1109/jsen.2023.3244818
Adaptive Long-Neck Network With Atrous-Residual Structure for Instance Segmentation
  • Apr 1, 2023
  • IEEE Sensors Journal
  • Wenjie Geng + 5 more

Instance segmentation is an important yet challenging task in the computer vision field. Existing mainstream single-stage solutions with parameterized mask representations design neck models to fuse features of different layers; however, their instance segmentation performance is still restricted by the layer-by-layer transmission scheme. In this article, an instance segmentation framework with an adaptive long-neck (ALN) network and atrous-residual structure is proposed. The long-neck network is composed of two bidirectional fusion units, which are cascaded to facilitate information exchange among features of different layers in top-down and bottom-up pathways. In particular, a new cross-layer transmission scheme is introduced in the top-down pathway to achieve a hybrid dense fusion of multiscale features, and the weights of different features are learned adaptively according to their respective contributions to promote network convergence. Meanwhile, the bottom-up pathway further complements the features with more location clues. In this way, high-level semantic information and low-level location information are tightly integrated. Furthermore, an atrous-residual structure is added to the mask prototype branch of instance prediction to capture more contextual information, which contributes to the generation of high-quality masks. The experimental results indicate that the proposed method achieves effective segmentation and that the output masks match the contours of objects.

  • Research Article
  • Cited by 11
  • 10.1108/ir-12-2019-0259
Deep instance segmentation and 6D object pose estimation in cluttered scenes for robotic autonomous grasping
  • Apr 20, 2020
  • Industrial Robot: the international journal of robotics research and application
  • Yongxiang Wu + 2 more

Purpose: This paper aims to design a deep neural network for object instance segmentation and six-dimensional (6D) pose estimation in cluttered scenes and apply the proposed method in real-world robotic autonomous grasping of household objects. Design/methodology/approach: A novel deep learning method is proposed for instance segmentation and 6D pose estimation in cluttered scenes. An iterative pose refinement network is integrated with the main network to obtain more robust final pose estimation results for robotic applications. To train the network, a technique is presented to generate abundant annotated synthetic data consisting of RGB-D images and object masks in a fast manner without any hand-labeling. For robotic grasping, offline grasp planning based on an eigengrasp planner is performed and combined with the online object pose estimation. Findings: The experiments on the standard pose benchmarking data sets showed that the method achieves better pose estimation and time efficiency performance than state-of-the-art methods with depth-based ICP refinement. The proposed method was also evaluated on a seven-DOF Kinova Jaco robot with an Intel Realsense RGB-D camera; the grasping results illustrated that the method is accurate and robust enough for real-world robotic applications. Originality/value: A novel 6D pose estimation network based on the instance segmentation framework is proposed, and a neural network-based iterative pose refinement module is integrated into the method. The proposed method exhibits satisfactory pose estimation and time efficiency for robotic grasping.

  • Research Article
  • 10.1038/s41598-026-40858-z
Edge-guided multi-scale instance segmentation for railway track
  • Feb 24, 2026
  • Scientific Reports
  • Junting Lin + 2 more

Efficient and accurate railway track segmentation is vital for autonomous train operation and early obstacle detection. However, blurred boundaries and complex backgrounds pose challenges for traditional instance segmentation models. To address these issues, a novel instance segmentation framework, SMDE-YOLO, is proposed; it builds upon YOLO11n-seg and is designed for railway track instance segmentation in complex environments. Considering the segmentation challenges posed by blurred track boundaries, a Scharr-based edge enhancement strategy is incorporated into the data augmentation phase to enhance edge feature expression. The network incorporates the Dynamic Multi-Branch Feature Pyramid Network (DMBFPN) to enhance multi-scale feature modeling and integrates a Multi-Scale Edge Feature Fusion (MSEFF) module to strengthen edge-guided feature aggregation. To balance lightweight design and segmentation accuracy, the Dual Enhanced Efficient Decoupled segmentation head (DEED-Seg) is also introduced to reduce computational load. Experimental results show that SMDE-YOLO delivers superior results on the Railsem7750 dataset. Accurate track localization remains vital for intelligent inspection and safety zoning in complex traffic environments.
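As a rough illustration of what a Scharr-based edge enhancement step might look like, the Python sketch below convolves a grayscale image with the standard 3x3 Scharr kernels and adds a scaled gradient magnitude back onto the image. The abstract does not describe the actual strategy, so the blending rule, the `alpha` weight, the untouched one-pixel border, and the function names are all assumptions made for this example.

```python
import math

# Standard 3x3 Scharr kernels for horizontal and vertical gradients.
SCHARR_X = ((-3, 0, 3), (-10, 0, 10), (-3, 0, 3))
SCHARR_Y = ((-3, -10, -3), (0, 0, 0), (3, 10, 3))


def scharr_magnitude(img):
    """Gradient magnitude of a grayscale image given as nested lists.

    Border pixels are left at 0.0 (assumption: no padding is applied).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SCHARR_X[dy][dx] * p
                    gy += SCHARR_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def enhance_edges(img, alpha=0.05, max_value=255.0):
    """Add alpha * gradient magnitude to each pixel, clipped to max_value.

    Hypothetical blending rule chosen for illustration only.
    """
    mag = scharr_magnitude(img)
    return [[min(max_value, img[y][x] + alpha * mag[y][x])
             for x in range(len(img[0]))]
            for y in range(len(img))]


# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
enhanced = enhance_edges(image)
```

On this step-edge input only the interior pixels next to the edge are brightened, while flat regions and the border keep their original values.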

  • Research Article
  • Cited by 1
  • 10.1016/j.media.2025.103471
SAF-IS: A spatial annotation free framework for instance segmentation of surgical tools.
  • Apr 1, 2025
  • Medical image analysis
  • Luca Sestini + 4 more

Instance segmentation of surgical instruments is a long-standing research problem, crucial for the development of many applications for computer-assisted surgery. This problem is commonly tackled via fully-supervised training of deep learning models, requiring expensive pixel-level annotations to train. In this work, we develop a framework for instance segmentation not relying on spatial annotations for training. Instead, our solution only requires binary tool masks, obtainable using recent unsupervised approaches, and tool presence labels, freely obtainable in robot-assisted surgery. Based on the binary mask information, our solution learns to extract individual tool instances from single frames, and to encode each instance into a compact vector representation, capturing its semantic features. Such representations guide the automatic selection of a tiny number of instances (8 only in our experiments), displayed to a human operator for tool-type labelling. The gathered information is finally used to match each training instance with a tool presence label, providing an effective supervision signal to train a tool instance classifier. We validate our framework on the EndoVis 2017 and 2018 segmentation datasets. We provide results using binary masks obtained either by manual annotation or as predictions of an unsupervised binary segmentation model. The latter solution yields an instance segmentation approach completely free from spatial annotations, outperforming several state-of-the-art fully-supervised segmentation approaches.

  • Research Article
  • Cited by 17
  • 10.3390/rs15030549
Multi-Swin Mask Transformer for Instance Segmentation of Agricultural Field Extraction
  • Jan 17, 2023
  • Remote Sensing
  • Bo Zhong + 7 more

With the rapid development of digital intelligent agriculture, the accurate extraction of field information from remote sensing imagery to guide agricultural planning has become an important issue. In order to better extract fields, we analyze the scale characteristics of agricultural fields and incorporate the multi-scale idea into a Transformer. We subsequently propose an improved deep learning method named the Multi-Swin Mask Transformer (MSMTransformer), which is based on Mask2Former (an end-to-end instance segmentation framework). In order to prove the capability and effectiveness of our method, the iFLYTEK Challenge 2021 Cultivated Land Extraction competition dataset is used and the results are compared with Mask R-CNN, HTC, Mask2Former, etc. The experimental results show that the network has excellent performance, achieving a bbox_AP50 score of 0.749 and a segm_AP50 score of 0.758. Through comparative experiments, it is shown that the MSMTransformer network achieves the optimal values in all the COCO segmentation indexes, and can effectively alleviate the overlapping problem caused by the end-to-end instance segmentation network in dense scenes.
