Abstract

Semi-supervised semantic segmentation needs rich and robust supervision for unlabeled data. However, promoting or penalizing feature similarities with vanilla contrastive learning can be unreliable in this setting: pixel pairs are labeled positive or negative based on noisy pseudo labels, and reliable and wrongly assigned pairs receive the same penalty. To address this issue, we propose correlation consistency learning, which leverages the rich pairwise relationships in self-correlation matrices and matches them to the similarities between soft pseudo labels to provide robust supervision. Unlike vanilla contrastive learning, our approach prioritizes pairs with highly confident pseudo labels and applies weaker penalties to less confident pairs. We also introduce a strong semi-supervised learning pipeline that applies data augmentation in a view-coherent manner: even under complex augmentation strategies, a match for each pixel can be found across the different augmented views. The novel components of the proposed method are the correlation consistency loss and the view-coherent data augmentation; their combination yields the view-coherent correlation consistency (VC3) system, which achieves state-of-the-art results in several semi-supervised settings on two datasets.
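
As a rough illustration of the correlation consistency idea, the sketch below compares a feature self-correlation matrix with the pairwise similarities of soft pseudo labels. It is a minimal sketch under stated assumptions, not the loss used in VC3: the abstract does not give the formula, so the cosine similarity, the inner product of class probabilities, the squared-error penalty, and all function names here are hypothetical choices made for illustration.

```python
import math


def cosine_self_correlation(features):
    """Return the N x N matrix of cosine similarities between feature vectors."""
    normed = []
    for f in features:
        # Guard against zero vectors by falling back to a norm of 1.
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        normed.append([x / norm for x in f])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]


def soft_label_similarity(probs):
    """Return the N x N matrix of inner products between soft pseudo labels."""
    return [[sum(a * b for a, b in zip(p, q)) for q in probs] for p in probs]


def correlation_consistency_loss(features, probs):
    """Mean squared gap between feature correlations and soft-label similarities.

    Assumption: only off-diagonal pairs (i != j) are compared, and the
    penalty is a plain squared error. Both are illustrative choices.
    """
    n = len(features)
    if n < 2:
        return 0.0
    corr = cosine_self_correlation(features)
    target = soft_label_similarity(probs)
    total = sum(
        (corr[i][j] - target[i][j]) ** 2
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


# Three pixels: two share a feature direction, one is orthogonal to them.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
confident = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

loss_confident = correlation_consistency_loss(features, confident)
loss_uncertain = correlation_consistency_loss(features, uncertain)
```

In this toy setup, confident pseudo labels give targets near 0 or 1, so the feature correlations are pushed hard toward agreement or disagreement. Uncertain pseudo labels give intermediate targets, so no pair is forced to an extreme on the strength of a possibly wrong hard assignment.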
