Abstract

Semantic image segmentation approaches based on convolutional neural networks require large amounts of pixel-level training data, but the labeling process is time-consuming and laborious. In this paper, we propose a semi-supervised semantic segmentation method that leverages unlabeled data during model training to reduce the labeling burden. We propose a novel GAN framework comprising a generator network and a dual discriminator network, and train the entire network by coupling the standard multi-class cross-entropy loss with the adversarial loss. To further improve the localization of object boundaries, a self-attention layer is added to the generator network to model long-range dependencies in images, and a skip layer is added to combine deep layers, which carry highly abstract information, with shallow layers, which carry detailed appearance information. The dual discriminator network includes a fully convolutional discriminator and a typical GAN discriminator, so that the input image can be discriminated at both the pixel level and the image level. For semi-supervised semantic segmentation, the predicted segmentation results of unlabeled images are selected by the image-level discriminator, and their trustworthy regions are then generated by the pixel-level discriminator to provide additional supervisory signals. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our approach outperforms existing semi-supervised semantic image segmentation methods in accuracy.
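
The two-stage use of the discriminators on unlabeled images can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values, and the choice of the generator's own argmax as the pseudo-label are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def trusted_region_mask(image_score, pixel_conf, image_thresh=0.5, pixel_thresh=0.2):
    """Gate an unlabeled prediction with two discriminator outputs.

    image_score: scalar from the image-level discriminator (hypothetical range 0..1).
    pixel_conf:  2D list of per-pixel confidences from the pixel-level discriminator.
    Both thresholds are illustrative values, not taken from the paper.
    Returns None when the image-level score rejects the whole prediction,
    otherwise a 2D binary mask of trusted pixels.
    """
    if image_score < image_thresh:
        return None
    return [[1 if c > pixel_thresh else 0 for c in row] for row in pixel_conf]

def masked_cross_entropy(probs, mask):
    """Average cross entropy over trusted pixels only.

    probs[i][j] is the list of class probabilities predicted at pixel (i, j).
    Assumption: the pseudo-label is the argmax class, so the per-pixel loss
    reduces to -log(max probability).
    """
    total, count = 0.0, 0
    for prob_row, mask_row in zip(probs, mask):
        for pixel_probs, trusted in zip(prob_row, mask_row):
            if trusted:
                total += -math.log(max(pixel_probs))
                count += 1
    return total / count if count else 0.0

# Toy 2x2 example with two classes.
pixel_conf = [[0.9, 0.1], [0.3, 0.05]]
probs = [[[0.5, 0.5], [0.9, 0.1]], [[0.25, 0.75], [0.6, 0.4]]]
mask = trusted_region_mask(0.9, pixel_conf)
loss = masked_cross_entropy(probs, mask)
```

In this sketch, pixels outside the trusted region contribute nothing to the loss, so a low-confidence region of a pseudo-label cannot push the generator in a wrong direction.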
