Abstract
Training a fully supervised semantic segmentation network requires a large number of expensive, manually produced pixel-level annotations. In this work, we study the semantic segmentation problem using only image-level supervision. A weakly supervised segmentation scheme is first employed to produce proxy annotations from image tags, and the segmentation network is then retrained on these noisy proxy annotations. However, learning from noisy annotations is risky, as proxy annotations of poor quality may degrade the performance of the baseline segmentation and classification networks. To train the segmentation network more effectively on noisy annotations, two novel loss functions are proposed in this paper: the selection loss and the attention loss. First, the selection loss weights the proxy annotations using a coarse-to-fine strategy that evaluates the quality of the segmentation masks. Second, the attention loss takes the clean image tags as supervision to correct classification errors caused by ambiguous pixel-level labels. Finally, we propose SAL-Net, an end-to-end semantic segmentation network guided by these two losses. Extensive experiments on the PASCAL VOC 2012 dataset show that SAL-Net achieves state-of-the-art performance, reaching mean IoU (mIoU) scores of 62.5% and 66.6% on the test set with VGG16 and ResNet101 backbones, respectively, and demonstrating its superiority over eight representative weakly supervised segmentation methods. The code and models are available at https://github.com/zmbhou/SALTMM.
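To make the roles of the two losses concrete, below is a minimal PyTorch-style sketch of how a selection-weighted segmentation loss and a tag-supervised attention loss could be combined. The quality-weighting scheme, function names, and loss balance are illustrative assumptions, not the authors' implementation; refer to the linked repository for the actual code.

```python
# Minimal sketch (assumed interface, not the authors' implementation) of
# combining a selection-style loss and an attention-style loss.
import torch
import torch.nn.functional as F

def selection_loss(seg_logits, proxy_masks, quality_weights):
    """Quality-weighted segmentation loss on noisy proxy annotations.

    seg_logits:      (B, C, H, W) predicted class scores
    proxy_masks:     (B, H, W)    noisy proxy annotations (class indices)
    quality_weights: (B,)         estimated mask quality in [0, 1] (assumed given)
    """
    per_pixel = F.cross_entropy(seg_logits, proxy_masks, reduction="none")  # (B, H, W)
    per_image = per_pixel.mean(dim=(1, 2))                                  # (B,)
    # Down-weight images whose proxy masks are judged to be low quality.
    return (quality_weights * per_image).mean()

def attention_loss(cls_logits, image_tags):
    """Multi-label classification loss supervised by clean image-level tags.

    cls_logits: (B, C) class scores pooled from the segmentation features
    image_tags: (B, C) float indicators of which classes appear in the image
    """
    return F.binary_cross_entropy_with_logits(cls_logits, image_tags)

def total_loss(seg_logits, cls_logits, proxy_masks, image_tags,
               quality_weights, lam=1.0):
    # lam balances the two terms; its value here is an arbitrary placeholder.
    return (selection_loss(seg_logits, proxy_masks, quality_weights)
            + lam * attention_loss(cls_logits, image_tags))
```

In this sketch the segmentation term is attenuated per image by an estimated mask-quality weight, while the classification term is always driven by the reliable image tags, mirroring the division of labor between the two losses described in the abstract.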