Abstract

Self-driving cars rely on semantic segmentation to understand urban scenes. Because segmentation labels are costly to collect, synthetic datasets are often used to train segmentation models. Unfortunately, the domain shift from synthetic to real images causes these models to perform poorly on real-world data. Prior works use adversarial training to align the features of synthetic and real-world images. We observe that background objects tend to be similar across domains, while foreground objects tend to vary more. Using this insight, we propose an adaptation method that uses foreground and background cues and adapts the two separately. We also propose a mask-aware gated discriminator that learns soft masks from the input foreground and background masks, instead of naively applying binary masking, which immediately removes all information outside the predicted masks. We evaluate our method on two different datasets and show that it outperforms several state-of-the-art baselines, which supports the effectiveness of our approach.
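The contrast between binary masking and a learned soft mask can be illustrated with a small NumPy sketch. This is not the paper's discriminator: the gate form (a per-channel affine map of the mask followed by a sigmoid), the function names, and the `weight` and `bias` values are all assumptions chosen only to show why a soft gate keeps some information outside the mask while a binary mask discards it.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_masking(features, mask):
    # features: (C, H, W); mask: (H, W) with values in {0, 1}.
    # Everything outside the mask is set to zero.
    return features * mask[None, :, :]

def soft_gating(features, mask, weight, bias):
    # Hypothetical gate (an assumption, not the paper's design):
    # a per-channel affine map of the mask, squashed to (0, 1).
    # In a real model, weight and bias would be learned.
    gate = sigmoid(weight[:, None, None] * mask[None, :, :] + bias[:, None, None])
    return features * gate

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 3))   # toy feature map, 4 channels
fg_mask = np.zeros((3, 3))
fg_mask[1, 1] = 1.0                     # toy foreground mask, one pixel

weight = np.full(4, 2.0)                # illustrative values, not learned
bias = np.full(4, -1.0)

hard = binary_masking(features, fg_mask)
soft = soft_gating(features, fg_mask, weight, bias)
```

With these toy values, `hard` is exactly zero at every pixel outside the mask, whereas `soft` only attenuates those pixels, so a downstream discriminator could still see context from outside the predicted mask.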
