Abstract

Semantic segmentation has been widely investigated in the community, and state-of-the-art techniques are based on supervised models. Those models report unprecedented performance at the cost of requiring a large set of high-quality segmentation masks for training. Obtaining such annotations is highly expensive and time-consuming, since semantic segmentation requires them at the pixel level. In this work, we address this problem by proposing a holistic solution framed as a self-training framework for semi-supervised semantic segmentation. The key idea of our technique is to extract pseudo-mask information from unlabelled data whilst enforcing segmentation consistency in a multi-task fashion. We achieve this through a three-stage solution. Firstly, a segmentation network is trained on the labelled data only, and rough pseudo-masks are generated for all images. Secondly, we reduce the uncertainty of the pseudo-masks using a multi-task model that enforces consistency and exploits the rich statistical information of the data. Finally, the segmentation model is retrained using the higher-quality pseudo-masks. We compare our approach against existing semi-supervised semantic segmentation methods and demonstrate state-of-the-art performance in extensive experiments.
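The three-stage pipeline described above can be sketched in miniature as follows. This is a hedged illustration only: the nearest-class-mean "segmenter", the function names, and the toy data are stand-ins invented for clarity, not the paper's actual networks or losses. Only the control flow (train on labels, pseudo-label, filter by consistency, retrain) mirrors the abstract.

```python
import numpy as np

# Illustrative sketch of the three-stage self-training loop. The
# nearest-class-mean "segmenter" is a toy stand-in for a real
# segmentation network; only the overall control flow is faithful.

def train_segmenter(images, masks, n_classes):
    """Fit one mean feature vector per class from (image, mask) pairs."""
    feats = images.reshape(-1, images.shape[-1])
    labels = masks.reshape(-1)
    return np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])

def predict(means, images):
    """Label every pixel with the class of the nearest class mean."""
    feats = images.reshape(-1, images.shape[-1])
    dists = ((feats[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1).reshape(images.shape[:-1])

def refine_pseudo_masks(masks_a, masks_b):
    """Stage-2 stand-in for the multi-task consistency model: keep pixels
    where two predictions agree, mark disagreements with an ignore label."""
    return np.where(masks_a == masks_b, masks_a, -1)

rng = np.random.default_rng(0)
n_classes = 2

def toy_batch(n):
    # Toy data: class-0 pixels cluster near 0.2, class-1 pixels near 0.8.
    masks = (rng.random((n, 8, 8)) > 0.5).astype(int)
    images = masks[..., None] * 0.6 + 0.2 + rng.normal(0.0, 0.05, (n, 8, 8, 1))
    return images, masks

labelled_images, labelled_masks = toy_batch(4)
unlabelled_images, unlabelled_true = toy_batch(4)  # true masks held out

# Stage 1: train on labelled data only; generate rough pseudo-masks.
means = train_segmenter(labelled_images, labelled_masks, n_classes)
pseudo = predict(means, unlabelled_images)

# Stage 2: reduce pseudo-mask uncertainty by requiring consistent
# predictions under a perturbed view of the same unlabelled images.
noisy = unlabelled_images + rng.normal(0.0, 0.05, unlabelled_images.shape)
refined = refine_pseudo_masks(pseudo, predict(means, noisy))

# Stage 3: retrain using labelled masks plus confident pseudo-labels only.
keep = refined >= 0
feats = np.concatenate([labelled_images.reshape(-1, 1),
                        unlabelled_images[keep]])
labels = np.concatenate([labelled_masks.reshape(-1), refined[keep]])
final_means = train_segmenter(feats, labels, n_classes)
```

In this sketch the uncertain pixels are simply dropped from retraining; the paper's actual stage 2 instead relies on a multi-task consistency model to improve pseudo-mask quality.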
