Abstract

Unlabeled data is often used to improve the generalization ability of a segmentation model. However, this practice tends to neglect the inherent difficulty of unlabeled samples and can produce inaccurate pseudo masks in unseen scenes, resulting in severe confirmation bias and potential performance degradation. These observations motivate two unexplored questions for newly arriving data: (1) how many images do we need to annotate, and (2) how should we annotate them? In this paper, we successively propose two shadow detectors, SDTR and SDTR+, based on the Transformer architecture and a self-training scheme. The main difference between them is whether weak annotations are required for part of the unlabeled data. Specifically, in SDTR, we first introduce an image-level sample selection scheme that separates the unlabeled data into reliable and unreliable samples according to the stability of their holistic predictions. We then perform selective retraining to exploit the unlabeled images progressively, in a curriculum-learning manner. In SDTR+, we further provide various weak labels (i.e., point, box, and scribble) for the remaining unreliable samples and design corresponding loss functions, achieving a better trade-off between performance improvement and annotation cost. Experimental results on public benchmarks (i.e., SBU, UCF, and ISTD) show that both SDTR and SDTR+ compare favorably against state-of-the-art methods.
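To make the sample-selection step concrete, below is a minimal PyTorch-style sketch of stability-based selection, assuming the detector is snapshotted at several training stages and that stability is scored as the mean IoU between each earlier snapshot's pseudo mask and the latest one. The function name, the IoU criterion, and the keep_ratio split are illustrative assumptions rather than the paper's exact formulation.

    import torch

    def select_reliable(models, loader, keep_ratio=0.5, device="cuda"):
        # `models`: the same shadow detector saved at two or more training
        # stages (a hypothetical setup; the paper's stability measure may
        # differ). `loader` is assumed to yield (index, image) pairs of
        # unlabeled data.
        for model in models:
            model.eval().to(device)
        scores = []
        with torch.no_grad():
            for idx, image in loader:
                image = image.to(device)
                # Binarize each snapshot's predicted shadow mask.
                masks = [(m(image).sigmoid() > 0.5).float() for m in models]
                ref = masks[-1]  # latest snapshot as the reference prediction
                ious = []
                for mask in masks[:-1]:
                    inter = (mask * ref).sum()
                    union = ((mask + ref) > 0).float().sum().clamp(min=1)
                    ious.append((inter / union).item())
                # Higher mean IoU across stages = more stable = more reliable.
                scores.append((idx, sum(ious) / len(ious)))
        scores.sort(key=lambda s: s[1], reverse=True)
        cut = int(len(scores) * keep_ratio)
        reliable = [i for i, _ in scores[:cut]]
        unreliable = [i for i, _ in scores[cut:]]
        return reliable, unreliable

Under this reading, the reliable subset would be used for pseudo-label retraining first, with the unreliable subset deferred to a later curriculum stage (SDTR) or given weak annotations instead (SDTR+).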
