Abstract

The goal of Multi-Source Domain Adaptation (MSDA) is to train a model (e.g., a neural network) with minimal target loss, utilizing training data from multiple source domains (source joint distributions) and a target domain (target joint distribution). The challenge is that these multiple source joint distributions differ from the target joint distribution. In this paper, we develop a theory showing that a neural network's target loss is upper bounded by both its source mixture loss (i.e., the loss concerning the source mixture joint distribution) and the Pearson χ2 divergence between the source mixture joint distribution and the target joint distribution. Here, the source mixture joint distribution is the mixture of the multiple source joint distributions with mixing weights. Accordingly, we propose an algorithm that optimizes both the mixing weights and the neural network to minimize the estimated source mixture loss and the estimated Pearson χ2 divergence. To estimate the Pearson χ2 divergence, we rewrite it as the maximal value of a quadratic functional, exploit a linear-in-parameter function as the functional's input, and solve the resultant optimization problem with an analytic solution. This analytic solution allows us to explicitly express the estimated divergence as a loss of the mixing weights and the network's feature extractor. Finally, we conduct experiments on popular image classification datasets, and the results show that our algorithm statistically outperforms the comparison algorithms. PyTorch code is available at https://github.com/sentaochen/Mixture-of-Joint-Distributions.
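As a rough illustration of the divergence-estimation step described above (a minimal sketch, not the authors' released implementation), the Pearson χ2 divergence admits the variational form χ2(P‖Q) = sup_f { 2 E_P[f] − E_Q[f^2] } − 1; restricting f to a linear-in-parameter function f(z) = θ^T φ(z) turns the supremum into a quadratic problem in θ with a closed-form maximizer. The function name `estimate_chi2`, the divergence direction, and the ridge term `reg` below are illustrative assumptions.

```python
import torch

def estimate_chi2(phi_p, phi_q, reg=1e-3):
    """Plug-in estimate of chi^2(P || Q) = sup_f { 2 E_P[f] - E_Q[f^2] } - 1,
    with f restricted to linear-in-parameter functions f(z) = theta^T phi(z).

    phi_p: (n_p, d) features of samples drawn from P
    phi_q: (n_q, d) features of samples drawn from Q
    reg:   small ridge term keeping the second-moment matrix invertible
    """
    d = phi_p.shape[1]
    mean_p = phi_p.mean(dim=0)                        # E_P[phi]
    second_q = phi_q.T @ phi_q / phi_q.shape[0]       # E_Q[phi phi^T]
    second_q = second_q + reg * torch.eye(d, device=phi_p.device)
    # Analytic maximizer: theta* = E_Q[phi phi^T]^{-1} E_P[phi].
    theta = torch.linalg.solve(second_q, mean_p)
    # Plugging theta* back into the quadratic objective and subtracting 1.
    return 2.0 * mean_p @ theta - theta @ second_q @ theta - 1.0
```

Because the estimate is differentiable with respect to the features, it can serve as a loss term for the feature extractor (and, with mixture-weighted source features, for the mixing weights), in the spirit of the algorithm outlined in the abstract.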
