Abstract

Domain adaptation for semantic segmentation is important because it enables effective knowledge transfer from a labeled source domain (e.g., synthetic data) to an unlabeled target domain (e.g., real images), without annotating any target samples. Prior domain adaptation methods mainly rely on image-to-image translation models to minimize differences in image appearance between the source and target domains. However, translation alone does not guarantee that feature representations of different classes in the target domain are well separated, which results in poorly discriminative representations. In this paper, we propose a unified learning pipeline, called Image Translation and Representation Alignment (ITRA), for domain-adaptive semantic segmentation. Specifically, the pipeline first aligns a source-domain image with a reference target-domain image using an image style transfer technique (e.g., CycleGAN); a novel pixel-centroid triplet loss is then designed to explicitly minimize intra-class feature variance and maximize the inter-class feature margin. Once the former step has reduced the appearance gap, the latter loss is easier to optimize and further decreases the domain shift. Extensive experiments demonstrate that the proposed pipeline facilitates both image translation and representation alignment and significantly outperforms previous methods in both the GTA5 → Cityscapes and SYNTHIA → Cityscapes scenarios.
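To make the pixel-centroid triplet loss concrete, the sketch below shows one plausible PyTorch implementation consistent with the abstract's description: each pixel embedding is pulled toward the centroid of its own class and pushed away from the nearest other-class centroid by a margin. This is a minimal illustration under our own assumptions (batch-wise centroids, Euclidean distances, a hypothetical `margin` hyperparameter), not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def pixel_centroid_triplet_loss(features, labels, num_classes, margin=0.5):
    """Sketch of a pixel-centroid triplet loss (assumed form, not the paper's exact loss).

    features: (N, C, H, W) pixel embeddings from the segmentation network
    labels:   (N, H, W) long tensor of class indices (ground truth or pseudo labels)
    """
    N, C, H, W = features.shape
    feats = features.permute(0, 2, 3, 1).reshape(-1, C)  # (N*H*W, C)
    labs = labels.reshape(-1)                            # (N*H*W,)

    # Per-class centroids computed over the current batch.
    centroids = feats.new_zeros(num_classes, C)
    valid = torch.zeros(num_classes, dtype=torch.bool, device=feats.device)
    for k in range(num_classes):
        mask = labs == k
        if mask.any():
            centroids[k] = feats[mask].mean(dim=0)
            valid[k] = True

    # Distance from every pixel embedding to every class centroid.
    dists = torch.cdist(feats, centroids)                # (N*H*W, K)

    # Positive term: distance to the pixel's own-class centroid (minimizes
    # intra-class variance).
    pos = dists.gather(1, labs.unsqueeze(1)).squeeze(1)

    # Negative term: distance to the closest *other*-class centroid
    # (maximizes the inter-class margin).
    masked = dists.clone()
    masked.scatter_(1, labs.unsqueeze(1), float('inf'))  # exclude own class
    masked[:, ~valid] = float('inf')                     # exclude absent classes
    neg = masked.min(dim=1).values

    # Standard triplet hinge with pixels as anchors and centroids as pos/neg.
    return F.relu(pos - neg + margin).mean()
```

In this reading, the loss is driven entirely by distances to class centroids rather than to individual pixel pairs, which keeps the number of triplets linear in the number of pixels and makes it cheap to apply on top of the translated (style-transferred) source images.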
