Unsupervised cross-domain semantic segmentation has recently gained much attention due to its ability to solve the segmentation problem on unlabeled domains. Traditional methods often employ an adversarial network to confuse the source and target inputs so as to align them in a new feature space. However, these methods cannot fuse the source and target domain information well, because the information of the two domains passes through separate network branches and never truly interacts. To tackle this problem, we propose a truly interactive learning framework, named Mutual Refinement and Information Distillation (MURID), to align the two domains. Concretely, MURID introduces a Mutual Refinement module in the shallow network layers to enhance information sharing and integration between the source and target domains, which effectively transfers knowledge from the source domain to the target domain. In addition, to avoid using the same structure for testing as for training, which would incur a huge computational cost, we exploit an Information Distillation module that simplifies the testing network while preserving the powerful inference capability of the training network. Moreover, we incorporate Curriculum Learning, a self-training mechanism that iteratively trains the network with pseudo-labels obtained from the target domain, to further improve performance. Extensive experiments were conducted on three popular datasets forming two benchmark tasks, GTA5→Cityscapes and Synthia→Cityscapes, and the results demonstrate the state-of-the-art performance of our method. Detailed analyses and ablation studies further validate the effectiveness of each designed module.
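The self-training mechanism mentioned above is commonly realized by keeping only confident target-domain predictions as pseudo-labels for the next training round. The following is a minimal sketch of such confidence-threshold pseudo-labeling; the function name, the 0.9 threshold, and the ignore index 255 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def generate_pseudo_labels(probs, threshold=0.9):
    """Assumed confidence-threshold pseudo-labeling for self-training.

    probs: (H, W, C) softmax output of a segmentation network on a
    target-domain image. Pixels whose maximum class probability falls
    below `threshold` are set to 255 (a conventional ignore index) so
    they do not contribute to the loss in the next training round.
    """
    confidence = probs.max(axis=-1)       # per-pixel max probability
    labels = probs.argmax(axis=-1)        # per-pixel predicted class
    labels[confidence < threshold] = 255  # mask out low-confidence pixels
    return labels

# Toy 2x2 prediction over 3 classes: only confident pixels keep a label.
probs = np.array([[[0.95, 0.03, 0.02], [0.40, 0.35, 0.25]],
                  [[0.10, 0.85, 0.05], [0.92, 0.05, 0.03]]])
pseudo = generate_pseudo_labels(probs, threshold=0.9)
print(pseudo)  # → [[0 255] [255 0]]
```

Iterating this step, i.e. retraining on the retained pseudo-labels and regenerating them with the improved model, is what the abstract refers to as the Curriculum Learning loop.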