MS-BFIRNet: Fine-grained Background Injection and Foreground Reconstruction with multi-supervision for few-shot segmentation

Similar Papers
  • Research Article
  • Citations: 109
  • 10.1109/tpami.2023.3265865
Base and Meta: A New Perspective on Few-Shot Segmentation.
  • Sep 1, 2023
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Chunbo Lang + 4 more

Despite the progress made by few-shot segmentation (FSS) in low-data regimes, the generalization capability of most previous works can be fragile when encountering hard query samples with seen-class objects. This paper proposes a fresh and powerful scheme to tackle this intractable bias problem, dubbed base and meta (BAM). Concretely, we apply an auxiliary branch (base learner) to the conventional FSS framework (meta learner) to explicitly identify base-class objects, i.e., the regions that do not need to be segmented. Then, the coarse results output by these two learners in parallel are adaptively integrated to derive accurate segmentation predictions. Considering the sensitivity of the meta learner, we further introduce adjustment factors that estimate the scene differences between support and query image pairs from both style and appearance perspectives, so as to facilitate the model ensemble forecasting. The remarkable performance gains on standard benchmarks (PASCAL-5i, COCO-20i, and FSS-1000) demonstrate its effectiveness, and surprisingly, our versatile scheme sets a new state of the art even with two plain learners. Furthermore, in light of its unique nature, we also discuss several more practical but challenging extensions, including generalized FSS, 3D point cloud FSS, class-agnostic FSS, cross-domain FSS, weak-label FSS, and zero-shot segmentation. Our source code is available at https://github.com/chunbolang/BAM.
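
The base-and-meta ensembling described above can be sketched in a few lines. This is our own minimal illustration, not the authors' implementation: `blend` is a hypothetical stand-in for BAM's learned adjustment factors, and the maps are plain nested lists rather than network outputs.

```python
def ensemble_predictions(meta_fg, base_cls, blend=0.5):
    """Combine a meta learner's foreground probability map with a base
    learner's base-class map: pixels confidently identified as base-class
    objects (regions that should not be segmented) are suppressed in the
    final foreground prediction.

    meta_fg, base_cls: 2D lists of probabilities in [0, 1].
    """
    h, w = len(meta_fg), len(meta_fg[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Down-weight the meta learner's foreground wherever the
            # base learner flags a (seen) base-class object.
            out[i][j] = meta_fg[i][j] * (1.0 - blend * base_cls[i][j])
    return out
```

In BAM itself the integration is adaptive (the factors are estimated from style and appearance differences between the support and query pair); a fixed `blend` only mimics the general shape of the ensemble.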

  • Research Article
  • Citations: 92
  • 10.1109/tmi.2021.3060551
Interactive Few-Shot Learning: Limited Supervision, Better Medical Image Segmentation.
  • Feb 19, 2021
  • IEEE Transactions on Medical Imaging
  • Ruiwei Feng + 6 more

Many supervised deep learning methods for medical image segmentation suffer from the expensive burden of data annotation for model training. Recently, few-shot segmentation methods were proposed to alleviate this burden, but such methods often show poor adaptability to the target tasks. By prudently introducing interactive learning into the few-shot learning strategy, we develop a novel few-shot segmentation approach called Interactive Few-shot Learning (IFSL), which not only addresses the annotation burden of medical image segmentation models but also tackles common issues of known few-shot segmentation methods. First, we design a new few-shot segmentation structure, called the Medical Prior-based Few-shot Learning Network (MPrNet), which uses only a few annotated samples (e.g., 10 samples) as support images to guide the segmentation of query images without any pre-training. Then, we propose an Interactive Learning-based Test Time Optimization Algorithm (IL-TTOA) to strengthen our MPrNet on the fly for the target task in an interactive fashion. To the best of our knowledge, our IFSL approach is the first to allow few-shot segmentation models to be optimized and strengthened on the target tasks in an interactive and controllable manner. Experiments on four few-shot segmentation tasks show that our IFSL approach outperforms state-of-the-art methods by more than 20% in the DSC metric. Specifically, the interactive optimization algorithm (IL-TTOA) further contributes ~10% DSC improvement for the few-shot segmentation models.

  • Conference Article
  • Citations: 6
  • 10.1109/vcip47243.2019.8965780
A New Few-shot Segmentation Network Based on Class Representation
  • Dec 1, 2019
  • Yuwei Yang + 4 more

This paper studies few-shot segmentation, the task of predicting the foreground mask of unseen classes from only a few annotations, aided by an existing set of rich annotations. Existing methods mainly frame the task as "how to transfer segmentation cues from support images (labeled images) to query images (unlabeled images)", and try to learn an efficient, general transfer module that can be easily extended to unseen classes. However, learning a transfer module that generalizes to various classes has proved challenging. This paper instead approaches few-shot segmentation from the new perspective of "how to represent unseen classes by existing classes", and formulates it as the process of precisely representing unseen classes (in terms of forming the foreground prior) by existing classes. Based on this idea, we propose a new class-representation-based few-shot segmentation framework, which first generates a class activation map (CAM) of the unseen class based on the knowledge of existing classes, and then uses the map as a foreground probability map to extract the foreground from the query image. A new two-branch few-shot segmentation network is proposed, along with a new CAM generation module that extracts the CAM of unseen classes rather than the classical training classes. We validate the effectiveness of our method on the Pascal VOC 2012 dataset: the FB-IoU for one-shot and five-shot reaches 69.2% and 70.1% respectively, outperforming the state-of-the-art method.

  • Research Article
  • Citations: 1
  • 10.1016/j.image.2024.117186
Prototype-wise self-knowledge distillation for few-shot segmentation
  • Aug 21, 2024
  • Signal Processing: Image Communication
  • Yadang Chen + 3 more

  • Research Article
  • Citations: 8
  • 10.1609/aaai.v38i6.28355
Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement
  • Mar 24, 2024
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Jing Wang + 5 more

Few-Shot Segmentation (FSS) aims to accomplish the novel class segmentation task with a few annotated images. Current FSS research based on meta-learning focuses on designing a complex interaction mechanism between query and support features. However, unlike humans, who can rapidly learn new things from limited samples, the existing approaches rely solely on fixed feature matching to tackle new tasks, lacking adaptability. In this paper, we propose a novel framework based on the adapter mechanism, namely Adaptive FSS, which can efficiently adapt an existing FSS model to novel classes. In detail, we design the Prototype Adaptive Module (PAM), which utilizes accurate category information provided by the support set to derive class prototypes, enhancing class-specific information in the multi-stage representation. In addition, our approach is compatible with diverse FSS methods with different backbones by simply inserting PAM between the layers of the encoder. Experiments demonstrate that our method effectively improves the performance of FSS models (e.g., MSANet, HDMNet, FPTrans, and DCAMA) and achieves new state-of-the-art (SOTA) results (i.e., 72.4% and 79.1% mIoU on PASCAL-5i 1-shot and 5-shot settings, 52.7% and 60.0% mIoU on COCO-20i 1-shot and 5-shot settings). Our code is available at https://github.com/jingw193/AdaptiveFSS.
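
Adapter-style prototype enhancement starts from class prototypes derived from the support set. The standard way to obtain such a prototype is masked average pooling over support features; the sketch below is a generic illustration of that step, not the authors' PAM module.

```python
def masked_average_prototype(features, mask):
    """Masked average pooling: average the feature vectors that fall
    inside the support mask to obtain a class prototype.

    features: H x W grid of C-dimensional vectors (nested lists).
    mask:     H x W grid of {0, 1} labels for the target class.
    """
    c = len(features[0][0])
    proto = [0.0] * c
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:  # only pool features under the foreground mask
                proto = [p + v for p, v in zip(proto, vec)]
                count += 1
    return [p / count for p in proto] if count else proto
```

The resulting vector can then be compared or fused with query features at each encoder stage, which is roughly where an adapter like PAM would inject class-specific information.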

  • Research Article
  • Citations: 3
  • 10.1016/j.patcog.2023.110202
A learnable support selection scheme for boosting few-shot segmentation
  • Dec 13, 2023
  • Pattern Recognition
  • Wenxuan Shao + 2 more

  • Research Article
  • Citations: 22
  • 10.1007/s11263-022-01677-7
CRCNet: Few-Shot Segmentation with Cross-Reference and Region–Global Conditional Networks
  • Sep 30, 2022
  • International Journal of Computer Vision
  • Weide Liu + 3 more

Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images. In this paper, we propose a Cross-Reference and Local–Global Conditional Networks (CRCNet) for few-shot segmentation. Unlike previous works that only predict the query image’s mask, our proposed model concurrently makes predictions for both the support image and the query image. Our network can better find the co-occurrent objects in the two images with a cross-reference mechanism, thus helping the few-shot segmentation task. To further improve feature comparison, we develop a local-global conditional module to capture both global and local relations. We also develop a mask refinement module to refine the prediction of the foreground regions recurrently. Experiments on the PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art performance.

  • Research Article
  • Citations: 3
  • 10.1109/tmm.2022.3215896
Rethinking and Improving Few-Shot Segmentation From a Contour-Aware Perspective
  • Jan 1, 2023
  • IEEE Transactions on Multimedia
  • Weimin Tan + 4 more

Existing few-shot segmentation approaches basically adopt the idea of comparing the semantic prototype vectors of the query image and support images, and then obtaining the segmentation result. However, recent studies have shown that a single feature vector in a feature map cannot accurately represent pixel-level categories, leading to poor segmentation of object boundaries and semantic ambiguity. To address this common problem, we propose a novel contour-aware network (CTANet) for few-shot segmentation. Unlike the usual practice of classifying each pixel separately, CTANet regards all pixels within the same contour as a whole, which takes advantage of the internal consistency of objects to obtain a more accurate representation of category information. To obtain an accurate object contour, our network consists of a contour generation module and a contour refinement module, where the former exploits multiple levels of features to generate a primary contour map and the latter learns to refine it. Furthermore, a novel contour-aware mixed loss is proposed to fuse the common BCE loss with our contour-aware loss, supervising training at two levels: pixel-level and contour-level. Extensive experiments demonstrate that our CTANet achieves new state-of-the-art performance on PASCAL-5i and COCO-20i. Hopefully, our new perspective can provide more clues for future research on few-shot segmentation. Our code is freely available at: https://github.com/hardtogetA/CTANet.
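
The two-level supervision above blends a pixel-level BCE term with a contour-level term. A rough sketch of such a mixed loss follows; this is our own minimal illustration with a hypothetical blend weight `alpha`, not the authors' exact formulation.

```python
import math

def mixed_loss(pred, target, contour_pred, contour_target, alpha=0.5):
    """Blend pixel-level and contour-level binary cross-entropy terms.

    pred / contour_pred:     flat lists of probabilities in [0, 1].
    target / contour_target: flat lists of {0, 1} labels.
    """
    def bce(probs, labels):
        eps = 1e-7  # avoid log(0)
        return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                    for p, t in zip(probs, labels)) / len(probs)
    # Pixel-level supervision plus contour-level supervision.
    return alpha * bce(pred, target) + (1 - alpha) * bce(contour_pred, contour_target)
```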

  • Research Article
  • Citations: 1
  • 10.1016/j.eswa.2024.125377
Combining hierarchical sparse representation with adaptive prompt for few-shot segmentation
  • Sep 20, 2024
  • Expert Systems With Applications
  • Xiaoliu Luo + 5 more

  • Research Article
  • Citations: 2
  • 10.3390/info12100406
PFMNet: Few-Shot Segmentation with Query Feature Enhancement and Multi-Scale Feature Matching
  • Sep 30, 2021
  • Information
  • Jingyao Li + 5 more

Datasets for the latest semantic segmentation models often need to be manually labeled pixel by pixel, which is time-consuming and labor-intensive. General models cannot make good predictions for categories of information never seen before, which motivated the emergence of few-shot segmentation. However, few-shot segmentation still faces two challenges. One is the inadequate exploration of the semantic information conveyed in high-level features, and the other is the inconsistency of segmenting objects at different scales. To solve these two problems, we propose a prior feature matching network (PFMNet). It includes two novel modules: (1) the Query Feature Enhancement Module (QFEM), which makes full use of the high-level semantic information in the support set to enhance the query feature, and (2) the Multi-Scale Feature Matching Module (MSFMM), which increases the matching probability for objects at multiple scales. Our method achieves an average intersection-over-union score of 61.3% for one-shot segmentation and 63.4% for five-shot segmentation, surpassing the state-of-the-art results by 0.5% and 1.5%, respectively.
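
The core of any such matching module is scoring each query location against a class prototype; multi-scale schemes repeat this step at several feature resolutions. A minimal single-scale sketch (our illustration, not PFMNet's implementation):

```python
import math

def cosine_match(query_feats, prototype):
    """Score every spatial location of a query feature map against a
    class prototype by cosine similarity, yielding a prior map.

    query_feats: H x W grid of C-dimensional vectors (nested lists).
    prototype:   C-dimensional vector.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u)) or 1e-12  # guard zero vectors
        nv = math.sqrt(sum(b * b for b in v)) or 1e-12
        return dot / (nu * nv)
    return [[cos(vec, prototype) for vec in row] for row in query_feats]
```

Running the same matching on features at several resolutions and fusing the resulting maps is the usual way to make the match consistent across object scales.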

  • Conference Article
  • Citations: 190
  • 10.1109/cvpr42600.2020.00422
CRNet: Cross-Reference Networks for Few-Shot Segmentation
  • Jun 1, 2020
  • Weide Liu + 3 more

Over the past few years, state-of-the-art image segmentation algorithms have been based on deep convolutional neural networks. To give a deep network the ability to understand a concept, humans need to collect a large amount of pixel-level annotated data to train the model, which is time-consuming and tedious. Recently, few-shot segmentation was proposed to solve this problem. Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images. In this paper, we propose a cross-reference network (CRNet) for few-shot segmentation. Unlike previous works which only predict the mask in the query image, our proposed model concurrently makes predictions for both the support image and the query image. With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images, thus helping the few-shot segmentation task. We also develop a mask refinement module to recurrently refine the prediction of the foreground regions. For k-shot learning, we propose to finetune parts of the network to take advantage of multiple labeled support images. Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
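
The cross-reference idea, mutually reinforcing what two images share, can be caricatured on plain vectors. This is a loose sketch of the general mechanism (our own illustration; CRNet's actual module operates on convolutional feature maps with learned layers):

```python
import math

def cross_reference(feat_a, feat_b):
    """Gate each feature vector by a sigmoid of the other, so channels
    that are active in BOTH inputs (co-occurrent content) are preserved
    and channels active in only one are attenuated."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    gate_a = [sigmoid(x) for x in feat_a]
    gate_b = [sigmoid(x) for x in feat_b]
    # Each side is reinforced by the other side's gate.
    return ([a * g for a, g in zip(feat_a, gate_b)],
            [b * g for b, g in zip(feat_b, gate_a)])
```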

  • Research Article
  • 10.1109/tpami.2025.3593816
Bridge the Intra-Class Gap: K-Shot Multi-Scale Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation.
  • Dec 1, 2025
  • IEEE transactions on pattern analysis and machine intelligence
  • Yuanwei Liu + 6 more

Few-shot segmentation (FSS) aims to accurately segment target objects in a query image using only a limited number of annotated support images. Existing approaches typically follow a paradigm that directly leverages category information from the support set to identify target objects in the query. However, these methods often ignore the category information gap between query and support images, leading to suboptimal performance when faced with images containing objects exhibiting significant intra-class diversity. To address this issue, we propose a novel framework that introduces intermediate prototypes to capture both deterministic information from the support images and adaptive knowledge from the query at multiple scales. Our framework, named the K-shot Multi-scale Intermediate Prototype Mining Transformer (KMIPMT), is based on the Transformer architecture and learns intermediate prototypes in an iterative manner, where each KMIPMT layer propagates category information from both K-shot support features and multi-scale query features to intermediate prototypes. This information is then utilized to activate the query feature map. Through repeated iterations, both intermediate prototypes and the query feature are progressively enhanced, and the final refined query feature is used for generating precise segmentation predictions. Despite its simplicity, our method achieves remarkable performance gains on standard benchmarks, including PASCAL-5i, COCO-20i, and FSS-1000, setting new state-of-the-art results. Furthermore, we explore several practical and challenging extensions of our method, including 3D point cloud FSS, zero-shot segmentation, weak-label FSS, and cross-domain FSS. These extensions showcase the versatility and effectiveness of our proposed KMIPMT framework across different domains and scenarios.

  • Conference Article
  • Citations: 71
  • 10.1109/cvpr52688.2022.01127
Generalized Few-shot Semantic Segmentation
  • Jun 1, 2022
  • Zhuotao Tian + 6 more

Training semantic segmentation models requires a large amount of finely annotated data, making it hard to quickly adapt to novel classes not satisfying this condition. Few-Shot Segmentation (FS-Seg) tackles this problem with many constraints. In this paper, we introduce a new benchmark, called Generalized Few-Shot Semantic Segmentation (GFS-Seg), to analyze the generalization ability of simultaneously segmenting the novel categories with very few examples and the base categories with sufficient examples. It is the first study showing that previous representative state-of-the-art FS-Seg methods fall short in GFS-Seg and that the performance discrepancy mainly comes from the constrained setting of FS-Seg. To make GFS-Seg tractable, we set up a GFS-Seg baseline that achieves decent performance without structural change to the original model. Then, since context is essential for semantic segmentation, we propose Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image. Both contributions are experimentally shown to have substantial practical merit. Extensive experiments on Pascal-VOC and COCO manifest the effectiveness of CAPL, and CAPL generalizes well to FS-Seg by achieving competitive performance. Code is available at https://github.com/dvlab-research/GFS-Seg.

  • Research Article
  • Citations: 6
  • 10.1109/tmi.2023.3258069
Robust Prototypical Few-Shot Organ Segmentation with Regularized Neural-ODEs.
  • Sep 1, 2023
  • IEEE Transactions on Medical Imaging
  • Prashant Pandey + 3 more

Despite the tremendous progress made by deep learning models in image semantic segmentation, they typically require large numbers of annotated examples, and increasing attention is being diverted to problem settings like Few-Shot Learning (FSL), where only a small amount of annotation is needed for generalisation to novel classes. This is especially true in medical domains, where dense pixel-level annotations are expensive to obtain. In this paper, we propose Regularized Prototypical Neural Ordinary Differential Equation (R-PNODE), a method that leverages intrinsic properties of Neural-ODEs, assisted and enhanced by additional cluster and consistency losses, to perform Few-Shot Segmentation (FSS) of organs. R-PNODE constrains support and query features from the same classes to lie closer in the representation space, thereby improving performance over existing Convolutional Neural Network (CNN) based FSS methods. We further demonstrate that while many existing deep CNN-based methods tend to be extremely vulnerable to adversarial attacks, R-PNODE exhibits increased adversarial robustness for a wide array of these attacks. We experiment with three publicly available multi-organ segmentation datasets in both in-domain and cross-domain FSS settings to demonstrate the efficacy of our method. In addition, we perform experiments with seven commonly used adversarial attacks in various settings to demonstrate R-PNODE's robustness. R-PNODE outperforms the baselines for FSS by significant margins and also shows superior performance for a wide array of attacks varying in intensity and design.

  • Research Article
  • Citations: 3
  • 10.1109/tpami.2024.3461779
Prompt-and-Transfer: Dynamic Class-Aware Enhancement for Few-Shot Segmentation.
  • Jan 1, 2025
  • IEEE transactions on pattern analysis and machine intelligence
  • Hanbo Bi + 7 more

For more efficient generalization to unseen domains (classes), most Few-shot Segmentation (FSS) methods directly exploit pre-trained encoders and only fine-tune the decoder, especially in the current era of large models. However, such fixed feature encoders tend to be class-agnostic, inevitably activating objects that are irrelevant to the target class. In contrast, humans can effortlessly focus on specific objects in the line of sight. This paper mimics the visual perception pattern of human beings and proposes a novel and powerful prompt-driven scheme, called "Prompt and Transfer" (PAT), which constructs a dynamic class-aware prompting paradigm to tune the encoder for focusing on the object of interest (target class) in the current task. Three key points are elaborated to enhance the prompting: 1) Cross-modal linguistic information is introduced to initialize prompts for each task. 2) Semantic Prompt Transfer (SPT) precisely transfers the class-specific semantics within the images to prompts. 3) Part Mask Generator (PMG) works in conjunction with SPT to adaptively generate different but complementary part prompts for different individuals. Surprisingly, PAT achieves competitive performance on 4 different tasks including standard FSS, Cross-domain FSS (e.g., CV, medical, and remote sensing domains), Weak-label FSS, and Zero-shot Segmentation, setting new state-of-the-arts on 11 benchmarks.
