Abstract

Multi-Source Domain Adaptation (MSDA) aims to train a classification model that achieves small target error by leveraging labeled data from multiple source domains and unlabeled data from a target domain. The source and target domains are described by related but distinct joint distributions, which lie on a Riemannian manifold known as the statistical manifold. In this paper, we characterize the difference between joint distributions by the Hellinger distance, which is closely connected to the Riemannian metric defined on the statistical manifold. We show that the target error of a neural network classification model is upper bounded by the average source error of the model plus the average Hellinger distance, i.e., the average of the Hellinger distances between each source joint distribution and the target joint distribution in the network representation space. Motivated by this error bound, we introduce Riemannian Representation Learning (RRL), an approach that trains the network model by minimizing (i) the average empirical Hellinger distance with respect to the representation function, and (ii) the average empirical source error with respect to the network model. Specifically, we estimate the average empirical Hellinger distance by constructing and solving unconstrained convex optimization problems whose globally optimal solutions are easy to find. Once trained, the network model is expected to achieve small error in the target domain. Experimental results on several image datasets demonstrate that the proposed RRL approach statistically outperforms the comparison methods.
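As a minimal illustration of the composite objective sketched in the abstract, the following Python snippet computes the Hellinger distance between two discrete (histogram) distributions in closed form and combines the average source error with the average source-target Hellinger distance. All names here (`hellinger`, `rrl_objective`, `source_errors`, `source_target_dists`) are hypothetical placeholders, and the histogram setting is an assumption for illustration only; in the paper, the empirical Hellinger distance is instead obtained by solving unconstrained convex optimization problems in the network representation space.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    H(p, q) = (1 / sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, which lies in [0, 1].
    """
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def rrl_objective(source_errors, source_target_dists):
    """Composite objective in the spirit of the error bound:
    average empirical source error plus average empirical
    source-target Hellinger distance (one term per source domain)."""
    return np.mean(source_errors) + np.mean(source_target_dists)

# Example: histograms of two joint distributions over a shared
# discretization of the representation space (hypothetical numbers).
p = [0.2, 0.5, 0.3]
q = [0.3, 0.4, 0.3]
print(hellinger(p, q))  # ~0.089
print(rrl_objective([0.10, 0.15], [hellinger(p, q), 0.12]))
```

In practice the joint distributions over a continuous representation space are not directly available as histograms, which is why the paper estimates the empirical Hellinger distances through convex surrogate problems rather than the closed form above.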
