Abstract

The recent advent of in-context learning (ICL) capabilities in large pre-trained models has yielded significant advancements in the generalization of segmentation models. By supplying domain-specific image-mask pairs, the ICL model can be effectively guided to produce optimal segmentation outcomes, eliminating the necessity for model fine-tuning or interactive prompting. However, current existing ICL-based segmentation models exhibit significant limitations when applied to medical segmentation datasets with substantial diversity. To address this issue, we propose a dual similarity checkup approach to guarantee the effectiveness of selected in-context samples so that their guidance can be maximally leveraged during inference. We first employ large pre-trained vision models for extracting strong semantic representations from input images and constructing a feature embedding memory bank for semantic similarity checkup during inference. Assuring the similarity in the input semantic space, we then minimize the discrepancy in the mask appearance distribution between the support set and the estimated mask appearance prior through similarity-weighted sampling and augmentation. We validate our proposed dual similarity checkup approach on eight publicly available medical segmentation datasets, and extensive experimental results demonstrate that our proposed method significantly improves the performance metrics of existing ICL-based segmentation models, particularly when applied to medical image datasets characterized by substantial diversity.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call