Abstract

Recently, semi-supervised domain adaptation (SSDA) has become increasingly practical because a small number of labeled target samples can significantly boost empirical target performance. Several current methods focus on prototype-based alignment to achieve cross-domain invariance, in which the labeled samples from the source and target domains are concatenated to estimate the prototypes. The model is then trained to assign the unlabeled target data to the prototype of the same class. However, such methods fail to exploit the advantage of the few labeled target samples because the labeled source data dominate the prototypes during supervision. Moreover, a recent method [1] showed that concatenating source and target samples for training can damage the semantic information of the representations, which degrades the trained model’s ability to generate discriminative features. To solve these problems, we divide the labeled source and target samples into two subgroups for training: one group contains a large number of labeled source samples, and the other contains the few labeled target samples. We then propose a novel SSDA framework that consists of two models. The inter-view model is trained on the labeled source samples and provides an “inter-view” of the unlabeled target data, whereas the intra-view model is trained on the few labeled target samples and provides an “intra-view” of the unlabeled target data. Finally, the two models collaborate to fully exploit the information in the unlabeled target data. By exploiting the advantages of multiple views and collaborative training, our method achieves, to the best of our knowledge, state-of-the-art SSDA classification performance in extensive experiments on several visual domain adaptation benchmark datasets.
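
The abstract describes the framework only at a high level. The PyTorch sketch below is one way to read that description: two classifiers trained on disjoint labeled subgroups (source-only and target-only) that then collaborate on unlabeled target data. Every concrete choice here is an assumption rather than the authors' implementation: the linear classifiers, the confidence-thresholded co-training step used as the "collaboration", and the 0.95 threshold are all illustrative placeholders.

```python
# Minimal sketch of the two-view SSDA setup described in the abstract.
# Model architectures, loss weights, and the exact collaboration objective
# are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES, FEAT_DIM = 10, 64

# Stand-in data: many labeled source samples, a few labeled target samples
# (here, 3 per class), and a pool of unlabeled target samples (random tensors).
xs, ys = torch.randn(256, FEAT_DIM), torch.randint(0, NUM_CLASSES, (256,))
xt, yt = torch.randn(3 * NUM_CLASSES, FEAT_DIM), torch.randint(0, NUM_CLASSES, (3 * NUM_CLASSES,))
xu = torch.randn(512, FEAT_DIM)

# Two separate classifiers: the "inter-view" model sees only labeled source
# data, the "intra-view" model sees only the few labeled target samples.
inter_view = nn.Linear(FEAT_DIM, NUM_CLASSES)
intra_view = nn.Linear(FEAT_DIM, NUM_CLASSES)
opt = torch.optim.SGD(list(inter_view.parameters()) + list(intra_view.parameters()), lr=1e-2)

CONF_THRESH = 0.95  # assumed confidence threshold for pseudo-labels

for step in range(100):
    opt.zero_grad()

    # Supervised losses on the two disjoint labeled subgroups.
    loss_inter = F.cross_entropy(inter_view(xs), ys)
    loss_intra = F.cross_entropy(intra_view(xt), yt)

    # One plausible collaboration scheme (co-training-style pseudo-labeling):
    # each model supervises the other on confidently predicted unlabeled
    # target samples. The paper's actual collaboration objective may differ.
    with torch.no_grad():
        p_inter = F.softmax(inter_view(xu), dim=1)
        p_intra = F.softmax(intra_view(xu), dim=1)
    conf_i, pl_i = p_inter.max(dim=1)  # pseudo-labels from the inter-view model
    conf_t, pl_t = p_intra.max(dim=1)  # pseudo-labels from the intra-view model

    loss_collab = xu.new_zeros(())
    mask_i = conf_i > CONF_THRESH
    if mask_i.any():
        loss_collab = loss_collab + F.cross_entropy(intra_view(xu[mask_i]), pl_i[mask_i])
    mask_t = conf_t > CONF_THRESH
    if mask_t.any():
        loss_collab = loss_collab + F.cross_entropy(inter_view(xu[mask_t]), pl_t[mask_t])

    (loss_inter + loss_intra + loss_collab).backward()
    opt.step()
```

The key point the sketch tries to capture is the data split itself: the source-heavy and target-only subgroups are never concatenated, so the few labeled target samples retain their own supervisory role instead of being dominated by source data.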

Highlights

  • Results on the DomainNet dataset: Our method achieved the best mean classification accuracy on DomainNet with both AlexNet and ResNet34 as the backbone network

  • Using ResNet34 as the backbone network, the proposed method achieved the best target-domain accuracy on all tasks and surpassed the current best results of CDAC [22] by 2.6% and 2.1% in the 1-shot and 3-shot settings, respectively

Introduction

With large-scale labeled data and the growth of computing power, supervised learning methods have achieved strong empirical results in various computer vision applications such as image classification [2]–[4], semantic image segmentation [5]–[7], and object detection [8]–[10]. These methods assume that the training and test sets come from the same distribution; however, in most real-world applications the training data (the source domain) and the test data (the target domain) are related but follow different distributions. Domain adaptation (DA) addresses this distribution shift. Depending on the availability of labeled target samples during training, DA can be categorized as unsupervised domain adaptation (UDA) [11]–[16] or semi-supervised domain adaptation (SSDA) [17]–[25].
